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CHAPTER  1 


ANALYSIS  OF  DELAY  ESTIMATION  IMPROVEMENT  FACTORS 
DUE  TO  MULTIPLE  MEASUREMENTS  AND  A  PRIORI  INFORMATION 

Abstract 

Signals  from  distant  sources  produce  intersensor  delays  in  towed  arrays 
which  may  have  their  measurement  variances  improved  when  all  possible  sensor- 
pair  measurements  and  a-priorl  knowledge  of  range  are  taken  into  account. 
Improvement  factors  are  defined  and  plotted  vs.  signal-to-noise  ratio  and  vs. 
number  of  sensors  in  the  arrays.  Clustered  vs.  equally  spaced  sensors  is 
another  aspect  investigated.  Analytical  results  are  obtained  for  three 
sensors . 


i 

i 
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I.  INTRODUCTION 


This  paper  intends  to  show  the  improvement  in  noisy  measurements  of 

passive  sonar  delays  by  using  a  priori  range  information  and  an  increased 

number  of  sensors.  The  problem  is  cast  into  state  variable  form  with  the 

sonar  delays  as  states.  Kalman  filtering  is  used  to  improve,  through 

a  priori  information  and  multiple  measurements,  the  delays  to  be  estimated. 

The  extent  of  improvement  can  be  examined  by  noting  the  decrease  in  the 
sonar  delay  estimate  variances  after  one  set  of  measurements  is  taken  and 
£  priori  information  is  taken  into  account. 

The  estimation  problem  is  investigated  first  for  three  sensors  and 
later  expanded  to  the  multiple  sensor  case.  For  all  cases,  the  sensors 
are  considered  to  be  perfectly  in  line,  as  in  an  ideal,  towed  array. 

When  they  are  clustered,  all  sensors  in  a  cluster  are  assumed  to  share 
the  same  spatial  point.  Other  assumptions  are  that  the  signal  spectrum 
is  the  same  at  every  sensor  and  the  noise  spectra  are  also  the  same,  but 
independent;  that  is,  the  noise  is  independent  and  identically  distributed 
at  each  sensor. 

With  three  passive  sensors  it  is  possible  to  use  the  measured  intersensor 
delays  for  estimating  the  bearing  and  range  of  a  signal  source.  The  measure¬ 
ments  are  often  provided  via  generalized  crosscorrelator  [1,6].  The 
measurements  z_  of  the  delays  d  are  generally  in  error;  however,  a  priori 
information  may  improve  that  error. 

Consider  the  sensor  array,  delays  and  measurements  depicted  in 


Figure  1. 


o 


o 


Figure  1.  Sensor  erray,  delays 
end  d^.  measurements  z^,  z^,  z^ 


The  relationship  between  the  measurements  and  the  delays  is  given 


where 


z  ■  H  d  +  e 


n::  r 


d2  /  ’  — 


and  e  is  the  error  vector.  The  crosscorrelator  measurement  covariance  is 


cov(e)  ■  R  •  r 
—  —  o 


,/1Y~Y 

7  Y  i  Y 

l-Y  y  i 


where  the  sign  of  y,  0<y<  0.5,  is  due  to  common  signals  in  the  cross¬ 
correlations  of  signals  from  members  of  sensor  pairs.  That  is,  if  there 
is  a  common  signal  or  sensor  in  two  delay  estimates,  y  has  the  positive 
sign  if  the  common  signal  is  used  first  in  both  delays  or  second  in  both; 
but  if  the  common  signal  is  used  first  in  one  delay  and  second  in  the  other 
the  sign  is  negative  [1].  For  large  signal-to-nolse  ratios  (SNR)  it  may 
be  seen  from  results  in  [1]  that  y  -*■  1/2.  Similarly  for  SNR  -►  0,  y  0. 
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r*. 
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Using  subscripts  ij  to  indicate  from  i  to  j,  it  is  clear  that  d._  * 


+  &23*  **ut  h°w  measurements  of  any  of  these  effect  estimation  of  others 


has  not  been  clear.  In  [2]  it  is  shown,  for  example,  that  at  low  SNR, 


measurement  error  variance  in  can  be  reduced  by  a  factor  of  2/3  by  use 


of  and  z^.  Further  in  [2],  a  priori  knowledge  of  source  (target)  range 


is  shown  to  give  an  improvement  factor  of  1/6  at  low  SNR.  The  effect  of 


redundant  measurements  and  a  priori  knowledge  is  next  analyzed  for  improvement 


over  measurement  error  covariance  in  z^,  which  improvement  was  not  explored 


in  [2],  due  to  that  particular  assignment  of  states.  Nor  was  relative 


improvement  with  number  of  sensors,  M,  considered. 


Now  let  the  a  priori  uncertainty  matrix,  cov  d  ,  be 


The  parameter  8  is  a  weight  corresponding  roughly  to  knowledge  of  target 


range.  For  infinite  range,  8*2,  because  any  a  priori  delay  estimation 


error  in  d^(0)  necessarily  is  twice  as  large  in  d2(0);  thus  cov  (d^(0). 


<*2(0))  ■  2  var  d^(0).  However,  when  range  is  totally  uncertain,  a  priori 


errors  are  in  no  way  correlated,  and  8  may  be  set  to  zero.  In  any  case,  8*2 


for  all  reasonable  ranges  (range  much  greater  than  sensor  array  length) . 


Thus  the  choice  of  0q  (4  +  a)  is  also  reasonable  for  ?22^ 


If  a  is  chosen  assuming  uniformly  distributed  bearing,  then 


2  2 

a  *  (L/v)  /2  where  L  is  the  sensor  spacing  and  v  is  the  velocity  of 


sound  in  the  medium,  aQ  is  the  mean  square  value  of  d^  -  (L/v)  sin  0, 
where  6  is  the  counterclockwise  angle  from  broadside. 


Measurements  of  <i  may  be  Improved  due  to  both  the  dependence  among 

2 

the  z_  elements  and  the  a  priori  Information  through  oq  and  S*  The 
correlator  measurement  variances  r^  and  r^  are  effectively  reduced  according 
to  Kalman  estimation  theory  [3]  which  yields  improved  or  updated  estimates 


of  d. 


where 


"  X  -1 

dx  -  PL  HRZ 


-1  T  -1  -1 

(P  +  H  R  H)  A 


cov(d^) 


The  a  priori  d,  d^,  has  been  assumed  to  be  the  zero  vector,  and  d  is 
assumed  a  constant  as  for  a  nonmoving  signal  source.  Extention  to  more 
general  situations  is  straight-forward  [2,  3].  P  is  the  new  estimate 
covariance  matrix.  Thus  its  element  p^(l),  f°r  example,  is  improved 
(smaller)  from  p^CO)  and  also  The  ratios 

P,,tt>  2 

*1  ’  3  Pll(1)/ro 


p22(1) 


p22(l,/ro 


are  the  improvement  factors  which  are  of  interest.  They  indicate  improvement 
over  the  correlator  measurements. 

II.  THREE  SENSOR  ANALYSIS 
2  2 

n,  and  are  functions  of  a  ,  r  ,  y,  $,  and  a.  This  may  mean 
J-  i  oo 

rather  complicated  expressions,  but  under  certain  simplifying  conditions 
they'are  easily  understood.  For  three  sensors,  the  nondimens ional  improvement 
factors  are 


(8) 


_ ± _ + 

(o  2/r  2)(4-62)  Y+l 

_  _  o  o _ 

^1 

1  .  2(5-6) _  .  3 _ 

(a  2/r  2)2(4-32)  (a  2/r  2)(4-B2)(y+1)  (Y+l)2 

0  0  0  0 

and 

(ao2/ro2)(4-S2)  +  Y+l 

f)  m  - 

1 _  ,  2(5-8) _  ,  3 

(a2/ r  2)2(4-82)  (a  2/r  2)(4-82)(y+1)  (Y+l)2 

O  O  0  o 


Next  che  a  priori  conditions  8*0  and  8*2  correspond  to  completely 

uncertain  range  and  infinite  range  respectively.  Letting  y  “  0  or  1/2 

corresponds  to  SNR  ■  0  or  infinity  respectively.  The  functions  or  n2 

are  given  in  Table  1  for  these  four  conditions. 

2 

The  table  shows  that  for  even  infinite  Oq  or  complete  a  priori  uncer- 

tainty,  improvement  is  made  in  d^  due  to  the  fact  that  z^  contributes  more 

2  2 

information.  That  is,  for  a  «  r  is  reduced  by  n, ,  ranging  from  1/6 

oo  l 

(infinite  range,  SNR+0)  to  1/4  (infinite  range,  SNR.-*00)  to  2/3  (uncertain  range, 

SNR-K)) .  Only  when  range  is  uncertain  and  SNR-*»  is  there  no  improvement.  These 
2  2 

data  (ro  /Oq  -*■  0)  show  that  both  the  dependent  measurements  and  a  priori 

knowledge  of  Infinite  or  large  range  contribute  to  Improvements. 

2  2  2 

Analyzing  ru  as  r  /4a  -+0  shows  improvements  in  r  to  p0-(l)  are  only 

L  O  O  O  LL 

possible  for  SNR-K),  regardless  of  a  priori  knowledge  of  range  (8).  For  both 
8  ■  0  or  8  ■  2  the  improvement  is  by  the  factor  n2  *  2/3.  From  (5),  using 
(8)  and  (9),  the  improved  delay  estimates  are 


asymptotic 
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for  d.;  o0‘  is  spriori  variance  for  d. 

1  o 

(apriori  uncertainty),  4o  is  apriori 
variance  for  d„.  ° 
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where 


2(5-6) _ 

(a  2/r  2) (4-82) (Y+l) 
o  o 


(11a) 


(Y+l) 


(a  2/r  2)2(4-62) 

o  o 


a  -  (a  2/r  2)(4-82) 
o  o 


2  2 

Again  under  the  condition  8<2,  and  letting  r  /a  -K)  (no  a  priori 

o  o  —  . 

knowledge  o£  d) , 

/  d,  \  / 2/3  1/3  -1/3  \  /  z,  \ 


(lib) 


2/3 

1/3 

-1/3 

1/3 

2/3 

1/3 

2  /  ,  0<B<2. 


2  2 

When  8*2,  then  the  estimates  become  for  r  /a  *  0, 

o  o 


1/6 

2/6 

1/6 

2/6 

4/6 

2/6 

)  ti\ . 
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One  conclusion  is  that  if  all  measurement  variances  are  equal,  and  there  is 

n0  ®  priori  knowledge  of  d,  knowledge  that  range  is  infinite  will  improve 

the  error  in  the  measurement  of  The  improvement  is  due  to  interdependence 

among  the  measurements  and  is  independent  of  SNR.  Another  conclusion  is  that 

2 

if  anything  is  known  of  d  a  priori,  Oq  is  finite,  and  measurement  errors  may 
be  improved  with  this  knowledge  also,  independent  of  SNR  and  a  priori  range 
(8)  knowledge. 
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III.  MULTI-SENSOR  DELAY  ESTIMATION  GIVEN  MEASUREMENT  AND 
A  PRIORI  COVARIANCES 


For  optimal  range  and  bearing  estimation,  the  best  array  configuration 
[5]  for  M  sensors  is  three  clusters  of  M/3  sensors  each  and  equal  spacing  L 
between  groups,  as  shown  in  Figure  2. 


ooo 


ooo 


ooo 


Equally  Spaced  Clusters 

Figure  2.  Multi-sensor  Sonar  Array  Configuration. 


All  sensors  in  a  "pod"  are  assumed  to  be  in  the  same  location,  i.e., 
there  is  no  delay  between  sensors  in  the  same  group.  The  noise  spectrum 
for  each  sensor  is  assumed  to  be  the  same,  but  independent  from  sensor  to 
sensor.  With  more  sensors,  more  delay  measurements  are  possible,  and  the 
increased  number  of  measurements  of  d^  and  d^  should  improve  the  Kalman 
estimate  as  indicated  in  Eq.  6.  We  show,  however,  that  the  amount  of  improve¬ 
ment  available  depends  on  the  sensors'  locations.  With  three  clusters,  there 
2 

are  only  (M/3)  measurements  instead  of  M(M-l)/2.  That  is  because  there  are 
no  measurements  between  sensors  in  the  same  cluster.  Even  though  there  are 

shM 

fewer  measurements,  a  priori  information^produce#  more  improvement,  and  it 
does.  Also  the  (M/3)  measurements  are  directly  end-to-center  or  end-to-end 
delays,  whereas  the  M(M-l)/2  measurements  must  be  linearly  combined  to  esti¬ 
mate  d^  or  d2> 


'  T?.1 


Many  examples  'from  numerical  calculation  are  available  in  [4] ,  wherein 

are  many  contour  plots  of  n  for  y  vs  8.  For  all  numerical  results,  Q  is  the 

2  2 
ratio  r  fa  . 
o  o 

The  n  vs.  M  plots  for  fixed  y  and  B  give  excellent  insight  as  to  the 

A  /X 

improvement  of  and  d^  by  increasing  the  number  of  simultaneous  delay 

measurements  taken.  In  general,  there  is  always  improvement  in  the  estimate 

(decrease  of  r^)  as  M  increases.  In  every  case,  the  order  of  improvement 

(least  to  most)  is  Q*0.1,  Qal,  then  Q*10.  0*0.1  implies  good  measurements 

2  2  2 

(Tq  <<0q  )>  and  therefore  r^  is  small.  It  is  more  difficult  to  improve  on 
measurements  when  they  are  already  very  good  (Q-0.1)  than  when  they  are 
(relative  to  a  priori)  more  highly  noise  contaminated  (Q-10) . 

Figure  3  for  three  and  nine  sensors  show  the  amount  of  improvement 
available  using  nine  sensors  over  using  just  three  sensors.  Regardless  of 
SNR  (indicated  through  y) ,  the  amount  of  additional  improvement  is  substantial. 

Curves  similar  to  those  of  Figure  3  were  plotted  for  sensors  equally 
spaced  along  the  line  array.  These  data.  Figure  4,  show  less  improvement  than 
that  of  the  clustered  sensors  as  M  increases  from  3  to  9. 

CONCLUSIONS 

This  paper  has  examined  the  effect  of  using  a  priori  information  and 
multiple  measurements  on  delay  estimate  variances.  The  estimate  variances 
after  one  set  of  measurements  is  taken  are  scaled  by  the  measurement  of  the 
delays  in  question.  Following  is  a  quick  summary  of  the  major  findings  in 
this  report  and  a  short  description  of  these  results  is  recorded  in  Table  II. 

1.  The  estimate  of  Improves  greatly  due  to  knowledge  that  range  is  much 

A 

greater  than  the  sensor  group  spacing;  however,  d2  does  not  improve  over 
measurement  by  range  information. 


2.  Improvement  factors  are  plotted  for  three  and  nine  sensors  equally 
spaced  rather  than  in  clusters.  Clustering  allows  even  greater  im- 
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OPTIMAL  DELAY  ESTIMATION  IN  A  MULTIPLE 


SENSOR  ARRAY  HAVING  SPATIALLY  CORRELATED  NOISE 


Abstract 


The  maximal  likelihood  (ML)  estimation  of  time-of-arrival  differences  for 


signals  from  a  single  source  or  target  arriving  at  M  2  2  sensors  has  been  the 


subject  of  a  large  number  of  papers  in  recent  years.  These  time  differences 


or  delays  enable  target  location.  Nearly  all  previous  work  has  assumed  noises 


which  are  independent  among  all  sensors.  Herein  noises  are  taken  to  have 


complex  correlation  between  sensors.  A  set  of  nonlinear  equations  in  the 


unknown  delays  is  derived.  The  Fisher  information  matrix  (FIM)  for  the 


estimates  is  also  derived.  The  Cramer  Rao  Matrix  Bound  (CRMB) ,  which  is  the 


inverse  of  FIM,  shows  optimal  estimator  covariances.  Computer  evaluations  are 


given  for  CRMB  elements  with  varied  SNR  and  noise  covariance  values  typical  of 


turbulent  boundary  layer  noise  in  towed  arrays. 
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OPTIMAL  DELAY  ESTIMATION  IN  A  MULTIPLE 
SENSOR  ARRAY  HAVING  SPATIALLY  CORRELATED  NOISE* 

I.  Introduction 

The  estimation  of  time-of-arrival  differences  for  signals  from  a 
single  source  or  target  arriving  at  multiple  sensors  has  been  the  subject 
of  a  considerable  number  of  papers  in  recent  years.  These  time  delay 
differences,  or  simply  delays,  enable  target  localization  through  straight¬ 
forward  geometrical  considerations  when  the  signal  path  is  non-dlspersive 
[1,2].  Although  target  location  is  the  primary  goal,  delay  estimation  is 
essentially  equivalent  as  there  is  a  one-one*,  although  nonlinear,  relation 
between  the  maximum-likelihood  (ML)  delay  vector  and  ML  location  vector. 

Essentially  all  of  the  results  of  available  literature  (except  [9])  have 
been  based  not  only  upon  the  geometric  and  non-dispersive  assumptions  stated 
above  but  also  upon  noise  spectra  which  are  independent  among  sensors.  The 
independent  noise  assumption  is  adequate  if  either  the  sensor  self-noise  is 
dominant  or  the  sensors  are  spatially  separated  sufficiently  such  that  the 
environmental  noise  is  indeed  independent  or  uncorrelated  among  sensors. 
However,  this  is  not  always  a  reasonable  assumption  and  the  effects  of 
spatially  correlated  noise  in  the  estimation  of  delays  and  delay  variances 
must  be  considered.  Thus  appropriate  analyses  are  herein  undertaken  to 
consider  correlated  noise.  Results  are  compared  to  those  previously  published 
for  independent  sensor  noises. 

Owsley  end  Fay  [11]  have  considered  correlated  noise  when  clustering 
sensors  and  optimizing  beamformers.  The  comparable  optimization  of  delay 
estimation  has  not  previously  been  approached.  By  choosing  the  correlation 
parameter  p,  we  may  include  the  proportionality  of  correlated  turbulent 
boundary  layer  tow-noise  and  isotropic  sensor  noise. 

The  basic  approach  is  to  assume  that  complex  Fourier  coefficients  X^(k) 
at  the  i—  sensor  for  the  k^  frequency  are  available,  having  been  obtained 
from  T-second  time  records,  where  T  is  long  with  respect  to  the  signal 
correlation  time. 


*  For  an  array  with  three  sensors  in  line  there  is  an  ambiguity  in  the 
sign  of  bearing  angle,  which  we  assume  may  be  solved  with  additional 
information. 


+  This  study  funded  under  office  of  Naval  Research,  contract  number 
N00014-82-K-0048. 
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A 


''■’SWV 


The  time  data  records  are 


x^t)  -  sCt-d^  +  ni(t),  i  ■  1,2,..  .M. 


where  di  are  the  delays  from  the  reference  sensor  to  the  i—  sensor 


(d^  *  0),  s(t)  is  a  zero-mean,  Gaussian,  stationary  signal,  and  n^(t)  is  the 


additive  Gaussian  noise  at  the  i —  sensor. 


II .  Background 


The  problem  of  delay  vector  estimation  for  multiple  sensors  has  been 
studied  with  the  above  approach  in  original  papers  by  Hahn  and  Tretter  [3], 

Hahn  [4]  and  Schultheiss  [5].  Closely  following  their  presentations,  let 

1  1/2 

X.  (k)  -  —  x.(t)  exp{-jku)  t}dt,  k  *  1,2,...,K,  (2) 

1  1  i-T/2  1  ° 


where  co  *  27r/T.  Define  a  vector  X  containing  the  above  MK  Fourier 


coefficients  as  elements.  If  S(w)  and  N^(co)  are  the  signal  and  noise 


spectra  at  the  i —  sensor,  the  probability  density  for  x  can  be  written 


K  K 

p(X)  -(***11  det  tKk))"1  exp [ -£  XT(k)R-1(k)X*(k)  ] 
k-1  k-1 


where 


X(k)  -  [Xj^Ck) ,  X2(k), - ^(k)] 


X-  [XT(1);  XT(2), - ,XT(K)]T 


V(k)  -  [1,  exp(-jku>od2) , - .expC-jku)^)  ] 


N(k)  *  [N^j(k)],  an  MxM  matrix  of  noise 


cross-power  spectra 


R(k)  -  N(k)  +  S(k)  V*(k)V  (k) 


and  where  *  superscript  denotes  complex  conjugation. 

In  order  to  obtain  the  ML  estimate  of  delays,  determinant  and  inverse 


theorems  of  use  are 


|r|  -  |n  +  sv*vT|  -  |n||i  +  n“1sv*v1 


M | ( 1  +  SVTN-1V*) 


'  *  v.r<1  >  t  *''/V  -.*>r  1 U  trfi’r* 


1^,  i'1,*  *,!  < 


v.>vT0»y«-< 


and 


R_1  -  N_1  -  N-1  V*(VTN_1V*  +  1/S)-1  VTN-1  (5) 

-1  ik 

Defining  elements  of  N  as  N  »  the  likelihood  function  of  the  delay  vector 
T 

D  -  (d2,  d^,...,  is,  using  (4), 

A  -  In  p(X)  ■  -Jln^**)  -  l  £n[  ]  N|  (1  +  SVTN-1V*)] 

B+ 

-  I  [XTN-1X*  -  XTN~1V*VTN~1X*1 

B+  VTN-1V*  4-  l/s 

where  £  means  sum  over  positive  frequencies. 

B+ 

Hahn  and  Tretter  [3]  have  shown  that,  when  N  is  diagonal,  [N^,^. . .  ,NM] , 

the  Fisher  information  matrix  for  D  (FIM  ■  -  <grad<grad  in  A>>  where  <•>  is 
expected  value)  is 

2 

FIM  -  l  2u)2  - - -  [  (tr  N-1)N  -1-  N  -111TN  -1]  (7) 

B+  1  +  l  S/f^  P  P  P 

where  N  1  is  N-1  with  the  first  row  and  column  removed.  The  Cramer-Rao 
P 

Matrix  bound  (CRMB)  for  the  delays  D  is  (FIM)  1.  The  ML  estimate  covariance 
is  known  to  asymptotically  approach  the  CRMB.  The  ML  estimate  for  small 

A 

delays  (D  is  the  error  when  D  -  0)  and  independent  noise  is 

D  =■  -<C>-1BT,  (8) 

where 

<C>-1  -  FIM,  (9) 

B  -  7  ju)  - - -  1TN-1[XX  *T-X*X  T]N  -1,  (10) 

B+  1  +  Is/Nt  P  P  P 

And  x  is  X(k)  with  the  first  element  (X. (k))  removed. 

P  1 

Hahn  and  Tretter  also  show  that  the  ML  D  estimate  can  be  implemented 
either  as  a  beamformer  (ideally  in  real  time  only  when  the  are  propor¬ 
tional,  because  of  phase-matching  filter  criteria),  or  as  a  cross  correlator 


2-3 


system  which  produces  the  M(M-l)/2  delay  estimates.  The  correlator  system 
has  cross-spectral  filters 


s /n  n 

I*!,!2 - P—  .  1,  i  -  1.  2 . M,  (11) 

1  1  +  [  s/\  iti. 

The  error  covariance  matrix  for  the  pair-wise  delay  estimates  of  the  cor¬ 
relators  is  shown  by  Hahn  [4]  to  have  elements 


covar(d^,  d^)  *  0,  i,  j,  k,  i  all  distinct 


A  2-rr 

covar(dirdu)  =  — 


B  *  1  Fi,1 


U' 


SNidu 


i  +  l 


(13) 


B  “2|rij|2sd“  1 8  “2|FulMM 


It  is  emphasized  that  these  are  correlator  error  covariances  of  the  d 


-  covar  (d^,  du) 


ij 


and  not  ML  estimator  error  variances,  which  are  derived  herein. 

The  delays  having  covariance  matrix  defined  by  (12)  and  (13)  are  not 
the  M-l  delays  referred  to  a  single  sensor.  Hahn  and  Tretter  have  shown  how 
to  use  weighted  linear  combinations  of  the  M(M-l)/2  cross  correlation  delay 
estimates,  d^,  to  form  an  estimate  for  D  »  (d^)  which  achieves  the  CRMB  of 
(7). 


With  independent  noises  maximization  of  A  in  (6)  over  the  vector  D 
concentrates  on  the  second  term  in  the  second  summation,  because  other  terms 
are  not  dependent  on  the  d^.  This  is  not  generally  the  case,  and  an  analytical 
solution  is  not  available,  as  was  pointed  out  in  the  multipath  analysis 
given  by  Owsley  [6].  However,  the  generation  of  a  set  of  nonlinear  equations 
in  the  unknowns  d^  may  be  obtained. 

In  the  next  section  ML  estimator  equations  for  the  M(M-l)/2  pairwise 
delays  are  derived.  Section  IV  produces  the  CRMB  for  a  subset  of  these  having 
M-l  independent  delays.  Section  V  considers  the  M-l  delays  dj^  -  d^  and 
Section  VI  derives  their  CRMB. 


*9 


& 


u 

a 


III.  Estimation  of  an  Independent  Subeet  of  the  M(M-l)/2  Delays 

This  section  will  determine  equations  for  ML  estimates  of  an  independent 
subset  of  the  M(M-l)/2  delays  *  d^.  In  contrast  most  other  papers 

referenced  find  ML  estimates  of  either  the  M-l  delays  (d^  -  d^) ,  2  <  i  <  M, 
or  other  parameters  such  as  range  and  bearing,  functions  of  which  the  d^  may 
be  written.  The  reason  for  our  choosing  the  d^  is  that  a-priorl  information 
about  linear  relationships  among  them  may  subsequently  be  used  as  in  [10] 

A 

to  improve  the  delay  estimates  d^  -  d,  or  any  other  subset. 

Consider  the  two  summations  in  (6),  the  only  functions  of  D.  He  would 

/v 

like  to  solve  for  the  d..  which  maximizes 

ik 

A,  -  j  i-An(g)  +  xV-Vv .y.1*.*  ,  (14) 

x  B+  8 

g  -  1/S  +  VTN-1V*  *  1/S  +  y  y  (cos  axl  Re{NPq> 

p  q>p  F 

+  sin  oxi  I  {Npq})  (15) 

pq  m 

Differentiating  with  respect  to  d^  (assuming  all  d^  independent) 
and  setting  this  equal  to  zero  and  rearranging  gives 


3(VTN~1V* 


-  +  xtn_1  u  n_1x*}-o 

o 


The  matrix  U(i,k)  *  (u  (i,k))  has  typical  elements  with  values  which  differ 

mn 

according  to  whether  or  not  (m,n)  *  (i,k)  or  (k,i).  These  are  found  to  be 


mn  g 


e^ik  (ju)  -  gj^/g) 


e'J^ik  (_jw  _  g^/g) 


joxi  , 

-eJ  mn  g^g 


* (m,n)*(i,k) 


,(m,n)-(k,i) 


, (m,n)^(i,k) ,  (k,i) 


Insertion  of  (17)  into  (16)  constitutes  (M-l)  equations,  nonlinear,  to  be 
solved  for  the  d^.  Note  that  only  M-l  delays  can  be  independent.  He  now 
turn  our  attention  to  the  CRMB. 


s/v.  §v ■- .  / r' 


VV  V 


'..VvV,';  v. 


^ - 


Ve  "I  A-y  v , *_  , r  ■>• •jV'tfC  '4  * *L_ 


.  >.  /.  *  4  «r>  , 


I 
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IV.  The  Cramer-Rao  Bound  for  an  Itxdependent  Subset  of  the  M(M-l)/2  Delay 
Estimates  of-  d^ 

As  is  well  known,  maximum  likelihood  estimators  have  variances  which 

A 

approach  the  Cramer-Rao  bound.  The  variance  bounds  for  the  are  the 
elements  in  the  diagonal  of 

-1  T 

CRMB  -  (FIM)  -  (-< grad (grad  £n  A^  »  , 

wherein  FIM  is  the  Fisher  Information  Matrix,  grad  A.  is  a  row  vector 

th  ^  ^ 

whose  m-^  element  is  the  derivative  of  A1  with  respect  to  the  nr=^-  delay 

th  ^ 

(the  nr-*1  d^  here),  A^  is  the  expression  in  Eq.  (14),  and  <•>  denotes 
expectation.  The  outer  gradient  operator  creates  a  matrix  whose  elements 

3  9A1 

are  ^  *  We  ^ave  alrea(*y  found  the  inner  partial  —  the  result 

rt  “ik 

is  Eq.  (16).  For  any  M-l  independent  delays  the  following  applies. 

Taking  and  negating  the  second  partial  with  respect  to  drt  gives 


l  ~  <812  -  g^7*  *  “  l  xVW^* 


where 


*1  3d 


s2  3d 


—  -  2w(-Re{Nilt}sina)d.,  +  Im(Nik}cosuxl ..  ) 


2w(-Re{Nrt}sinwd  +  Im{Nrt}cosu)d  ) 
rt  rt 


and  where  b  has  the  following  values  (letting  U  *  0  if  (r,t)  f  (i,k), 
mn 

4  -  1  if  (r,t)  -  (i,k) ) 


ik  '  g 
bU  ■  blk* 


”®2  2  *12  *1  *1*2  loud 

(  —  Jw  -  U(u  +  —  +  —  jw)  +  2-~^-)eJC"ik 


(22a) 


g  g 


brt  f3“  +  ^ 
© 


b  -  b  * 
tr  rt 


.  ;  (r,t)  ^(i,k) 

1*1*2 


(22b) 


T/r 


Using 


<X  X  *> 
r  t 


(23) 


1 


Se“iujdrt  +  N*t  ,  r  t 
S  +  N,  .  r  -  t 


Che  elements  of  the  FIM  are 


(FIM) 


rt.ik 


|+  g  (s12  '  «l*2/8) 


o'**  in 

+  2  l  ( (Scoswd  +  Re{N*  })  ReCN^N01*^} 
qjhn  m<*  mcl 

-  (-Ssintod  +  Im{N*  })  Im{NmmNmq}) 
mq  mq  ' 

+  2  l  l  (('Scosaxi  +  Re{N*  })  Re{NpmNmq} 
p?hn  q>p  pt* 

-(-Ssinuipq  +  Im{Npq})  rm{NpmNmq}) 


+  y 

pT*m 


|  N^m  |  ^  (Scosu  d 

pm 


+  Re(S*  })] 

pm 


+  2  ll 

m  n>m 


[Re{b  }  <  Re{G(m,n)}> 
mn 


-  Im{b  }  <  Im{G(m,n)}  >]} 
mn 


where 

<Re{G(m,n)}>  -  j  £  (Re{NpmNnq)  (S  cosud  +  Re{N*  }) 
p  q  Pq  pq 

-Im{NpmNnq}  (-S  sinwd  +  Im{N*  })) 

pq  pq 

and 

<  Im{G(m,n)}>  *  l  [  (Re{fJprajjnq} ( -Ssinud  +  Ira{N*  }) 

p  q  pq  pq 

+  Im{NpmNDq)(S  coswd  +  Re{N*  })) 

pq  pq 

Use  of  these  elements  in  the  FIM  is  restricted  for  inversion  to  the 
CRMB  to  M-l  independent  delays. 


(24) 


(25a) 


(25b) 


We  now  investigate  for  the  M-l  changes  in  earlier  results 
caused  by  consideration  of  noise  which  is  spatially  correlated. 


V.  Maximum  Likelihood  Estimation 


of  the  M-l  Delays  d^-d^ 


We  again  maximize  A  by  maximizing 


A  =  -  £  in  g  +  £  XTN_1V*VTN_1X*/g 
B+  B+ 


Writing 


T  -1 
X  N 


(£  XpNpl,  £  XpNp2,  ....  £  XpNPm) 


and  the  m,n —  element 


(V*vT)  =  v  -  e  ju)(dm"dn) 
m,n  mn 


gives 


XTN"1V*VTN"LX*/g 


*  1 i  *»”v“  +  i  *  **<m 

8  m  a  1 

+  X  *  Nntl  £  X  Npm 
n  P 


+  £  £  XX*  NpmNqn). 

p^m  q^n  P  q 


In  this  form  it  may  be  seen  that  (26)  differs  from  the  spatially  uncorrelated 
noise  case  only  in  the  -£  in  g  term  and  the  terms  in  parentheses  in  (28) 


other  than  X^*  N^V".  If  p  »  0,  In  g  is  not  a  function  of  the  delays 


and  all  N  ,  p  #  q,  are  zero.  Then  as  the  literature  cited  shows  [3,4], 

maximization  of  (6)  reduces  to  either  a  beamformer  (choosing  M-l  d  )  or  a 

3  i 

system  of  M(M-l)/2  correlators  (choosing  d^d^). 


Maximizing  A  means  solving  for  d^  in  dh^/dd^  ■  0. 


msceuxx&eMi ima 


Using 


gives 


g.  -  2w  7  (-Re{Nip}sina)(d.  -  d  )  +  Im{Nip}cosw(d,  -  d  ))  (29) 

i  3d.,  ip  ip 


t—  *~y  —  y  (Re{N*P}sinu(d ,-d  )  +  Im{Ni'P}cosa>(d ,-d  )) 

n  .  g  -J.J  1  P  IP 


i  B+  ®  p^i 


r  T  -1  -1 

+  I  X  N  AN  AX* 

B+ 

where  A  »  (a  (i))  and 

mtl  /  ju(d  -d  )  .  ,  . 

,  (  -g.e  m  n  /g) ;  m,n  f  i  or  a  *  n  »  i 

mn  g  \  /.  „  >s  jti)(d  -d  )  ,  . 

)  (jw  “  g^g)  eJ  i  n  ;  m  *  i,  n  f  i 


|^(-ju  -  g^g)  e  ’^di  dm^  ;  m  +  i,  n  *  i 


Because  the  a  (i)  are  functions  of  g.  and  g,  and  g_,  and  g  are  functions  of 
mn  i  i 

all  delay  differences  d^-d^,  the  solution  for  d^  cannot  be  found  in  terms 

of  X.  and  X,  alone  nor  even  as  a  linear  combination  of  the  X  X  *  e^w^dp_dq^ 
l  1  P  q  v  H 

correlators. 

VI.  The  Cramer-Rao  Matrix  Bound  for  the  M-l  Delays  d^d^. 

It  is  well  known  that  ML  estimators  approach  the  Cramer-Rao  bound 

/v 

(CRMB) .  The  variance  bounds  for  the  delay  estimates  d^  are  the  diagonal 
elements  in 

CRMB  -  (FIM)-1  =*  (-<  grad  (grad  A  j) T>) -1 ,  ( 


wherein  FIM  is  the  Fisher  Information  Matrix,  grad  A  is  a  row  vector  whose 

t  h  *  k  l 

m —  element  is  the  derivative  of  A  with  respect  to  the  delay  d  , 

1  nH*l 

is  the  expression  in  (14),  and  <•>  denotes  expectation.  The  outer  gradient 

9  5  \ 

operator  creates  a  matrix  whose  elements  are  -  •  The  inner  partial 

•  *  k  *  y 

has  already  been  given  by  (30)  and  (31).  Continuing  with  -  g,  and 

3d,  k 
k 

3g 

glk  -  -  2u2(Re{Nik}  cos  w^-d^  +  Im{Nik}  sin  <*>(dt-dk))  (33) 
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where  w  have  the  following  values, 
mn  0 
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Using  the  above  results  gives 
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+  2  V  y  [Re{w  }  <  RetG(m,n)}  > 
m  n>m  mn 
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(39) 


To  compare  with  previous  results  observe  in  (36)  and  (37)  that  if  noise 
is  spatially  uncorrelated,  g  -  0,  and  only  u  -  u*  *  u2  eI:}'A,(di‘dk)  and 

IK  rCl. 

*  2  ±1uj(d  -d  ) 

win  *  Wni  “  “w  e  1  n  are  non-zero.  Further,  in  (25)  p  -  m  and 

q  *  n  give  the  only  non-zero  terms.  Utilizing  the  above, 
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ZOj  ii  kk 
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where  g  *  |  +  J  N11, 


and  similarly 

(FIM) 


ii 


.  V  -  ^  MiiNnn  S, 
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B+  n#i 


It  is  readily  seen  that  this  FIM  is  identical  to  Eq.  7  (the  same  as  Eq.  12 
in  (31). 

Unfortunately  the  FIM  defined  by  (36)  and  (39)  has  elements  which  are 
in  general  functions  of  the  delays  themselves,  making  analysis  difficult. 
However,  in  the  next  section  we  will  assume  a  signal  source  at  infinity, 
allowing  some  simplification. 
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VII.  Evaluacion  of  Che  CRMB 


It  is  unreasonable  to  evaluate  and  invert  the  FIM  in  sections  IV  or 
VI  in  general  because  it  is  a  function  of  all  However,  if  wavefront 

curvature  is  ignored,  each  delay  may  be  written  di  *  iA  where  A  is  the 
delay  between  adjacent  sensors.  We  may  also  let  A  vary  between  zero  and 
ojA  *  tt  for  a  single  frequency.  Then  d^  -  d^  ■  (p-q)A  for  example.  This 
is  the  beamformer  case. 

Because  of  the  generality  of  the  formulas  we  may  also  vary  the  elements 
of  N,  using  the  symmetric  matrix  (as  in  [11]) 


Z1  pi 


3  2  Mm-1 


N  -  N, 


31  •“  pm-2 


1  ... 


wherein  *  PQ  e  rae^(i,^-  Such  a  correlation  is  appropriate  for  turbulent 
boundary  layer  noises  and  its  magnitude  with  respect  to  the  unity  diagonal 
accounts  for  isotropic  noise.  In  the  following  simulations  we  choose  |pjJ* 
0,  0.2,  0.4  and  wB  having  values  0  through  it/ 2. 

Figures  2-10  show  the  CRMB  (1,1)  element,  center  element,  or  last 
element  as  a  function  of  the  various  parameters.  Table  I  is  presented  as 
a  guide  to  comparisons. 

The  formulas  for  the  FIM  may  be  applied  to  arrays  with  clustered 
elements  as  well,  if  the  spacing  between  clusters  is  considered.  We  have 
done  this  in  producing  the  data  in  Figures  11  and  12.  Zero  correlation 
between  clusters  is  assumed. 

The  clustered  (or  grouped)  arrays  studied  are  as  shown  in  Figure  1. 


3  sensors 
•*-  9  sensors 

■*-  15  sensors 


Figure  1.  Scheme  for  clustering  sensors 


The  spacing  between  array  ends  and  ends-to-center  remains  fixed.  The 
effect  of  adding  sensors  to  the  cluster  when  spatially  correlated  noise 
is  present  can  then  be  observed. 


Comments  derived  from  the  Figures  are  as  follows 

1.  Figures  2  and  5  show  that  variance  decreases  monotonically  with 
SNR  and  that  variations  in  p  from  0  through  0.4  /.45a  have 
unpredictable,  but  not  large  effects. 

2.  Figures  4  and  7  for  example  show  that  more  sensors  (from  3  to 
15)  will  reduce  the  variance  of  a  delay. 

3.  Figures  8  and  9  show  that  variance  bounds  for  delays  end-to-center 
will  vary  with  p  differently  than  those  for  end-to-end,  but  not 

a  lot.  Also  the  end-to-end  delays  vary  least. 

4.  Comparing  Figs.  1  and  3  for  example  shows  that  the  effect  of  p 

on  a  delay  estimator  will  vary  with  ooA  (look  angle.)  This  variance 
is  easier  to  see  in  Figures  4-9. 

5.  Figures  4-9  demonstrate  that  the  bounds  are  effected  by  look 
angle  to  a  much  larger  extent  when  p  is  increased  to  0.4.  As  much 
as  5dB  (Figs.  5,8)  is  observed  at  SNR  »  0.1. 

6.  Comparing  Figures  in  4-9  with  like  SNR  and  M  shows  that  different 
delays  are  effected  quite  differently  as  wA  varies;  i.e.  CRMB  (1,1), 
(2,2),  (7,7)  or  (14,14)  all  vary  differently  with  p  and  wA. 

7.  Pursuing  the  question  of  how  much  clustering  is  effective  when 
spatial  noise  is  present.  Figures  10  through  12  plot  the  variance 
bounds  vs  sensor  number  M.  Figure  12  holds  array  length  constant. 

We  conclude  that  delay  variances  are  reduced  much  less  for  M 
changing  from  9  to  15  than  for  M  charging  from  3  to  9. 

The  last  comment  is  meant  to  be  one  of  the  basic  conclusions  of  this 
study:  that  for  significant  spatially  correlated  noise,  there  is  a  point 
beyond  which  it  does  not  pay  to  increase  sensor  number  in  a  cluster  when 
the  purpose  is  to  reduce  delay  variance  between  clusters. 
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Figure  11.  CRMB (M-1,M-1)  vs  M.  P-0.2.  A— SNR-0.1,  B— SNR-l.O, 
C-* -SNR-10. 0.  M  sensors  have  constant  array  length; 
clusters  of  1,  3,  and  5  elements  at  center  and  ends 
of  array.  No  noise  correlation  between  clusters. 
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Figure  12.  CRMB(M-1,M-1)  vs  M.  SNR-1.0.  A— p-0.0,  B— P-0.2, 

C— p-0.4.  M  sensors  heve  constant  array  length; 
clusters  o£  1,  3,  and  5  elements  at  center  and  ends 
of  array.  No  noise  correlation  between  clusters. 
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CHAPTER  3 


A  ROBUST  RUNNING-WINDOW  DETECTOR  AND  ESTIMATOR  FOR 
STEP-SIGNALS  IN  CONTAMINATED  GAUSSIAN  NOISE 

Abstract 

An  N-point  window  is  applied  to  noisy  data  to  recover  stepped  signals  in 
non-Gaussian  noise.  Robust  measures  of  signal  step  level  and  noise 

distribution  spread  are  used  to  detect  sequential  clusters  of  data  points 
which  are  statistically  significantly  different,  thereby  detecting  the  step. 
Using  conventional  anal ysis-of -variance  methods,  but  with  robust  parameter 
estimates,  false  alarm  probabilities  are  set  reasonably  accurately  and  miss 
probabilities  and  signal  level  estimates  are  shown  by  simulation  to  yield  good 
results.  Applications  to  Kalman  filtering,  seismic  and  well-log  data,  and 
image  processing  are  indicated. 


I.  Introduction 


The  detection  of  steps  of  signal  level  in  noisy  data  is  a  common  problem. 
Adaptive  Kalman  filters  estimate  measurement  bias  and  unknown  step  inputs  to 
the  process  [1,2].  Image  processing  and  segmentation  often  rely  on  detecting 
regions  of  constant  level  [3].  The  robust  median  filtering  method  for  extract¬ 
ing  stepped  signals  in  the  non-Gaussian  noise  of  well-logs  has  been  found  to 
be  an  improvement  over  such  linear  methods  as  linear  prediction,  Harkov  edit¬ 
ing,  and  running  average  [4].  The  median  itself  is  a  robust  measure  of 
location  [5]  which  is  resistant  to  values  of  wildpoints  in  the  minority  of  the 
sample.  Robust  measures  of  spread  are  also  appropriate  when  the  error  distri¬ 
bution  is  now  known. 

In  the  following  we  suggest  robust  methods  for  the  detection  of  steps  in 
non-gaussian  noise.  The  noise  used  in  the  simulation  is  contaminated  Gaussian, 
which  is  typical  of  heavy-tailed  densities.  Thus  a  step-detector  of  signals 
in  such  noise  must  not  only  be  able  to  locate  the  step  but  estimate  its  level, 
while  being  resistant  to  the  extreme  values  of  the  wild  points  in  the  sample 
window.  Our  methodology  is  related  through  its  non-linear  approach  to  those 
of  median  and  order  filtering  [12-14]. 

We  show  by  simulation  that  the  suggested  robust  and  resistant  algorithm 
can  be  designed  to  give  predictable  results,  giving  stepped  output  with  excel¬ 
lent  step  location  accuracy  even  in  low  signal- to-noise  ratios. 

II.  Running  Cluster  Detection 

We  wish  to  know  if  within  the  data  window  of  length  N  there  is  a  step  in 
the  noise-free  signal.  If  there  is  a  step  we  want  to  know  its  location  and 
size.  Several  statistical  tests  are  appropriate.  Conventional  tests  on  nor¬ 
mal  data  are  compared  with  robust  methods. 


In  order  Co  determine  the  most  likely  position  of  the  step  — -  if  it 


exists  —  within  the  window,  we  cluster  the  j  points  on  the  left  in  the 
window  into  cluster  CQ  and  the  N  -  j  points  on  the  right  into  C^.  That  j  which 
maximizes  the  cluster-difference  to  spread-within-clusters  ratio  is  assigned 
the  step  location  point.  Two  robust  methods  are  proposed  for  this  purpose. 


Robust  cluster  algorithm  number  1 

For  any  j,  the  cluster  difference  S  is  defined  as  the  median  difference 
[6,  p.  133] 
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(1) 


a  -  1,  2,  ....  j  ,  6  -  1,  2,  ...  ,  N-j. 

The  within-clusters  noise  variance  will  be  estimated  with  the  robust 
formula 
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(2) 


where 

A(Ha)  -  ira  -  ya2)(l  -  5yaz)][-l  +  r (1  -  ua2)(l  -  5pa2)]  (3) 

and  A  (Mg)  is  similar;  ci  and  3  index  over  the  members  of  clusters  Cq  and 
respectively  and  indicates  only  those  members  of  the  sum  for  which  y2  £  1, 
and  where  for  example 


u  \-N-KX  “  m*d^Xk-N+a^ 

a“  9  06,11 1  VlHa  '  MdlWI} 


(4) 


Formula  (2)  combines  robust  variance  estimation  [5],  with  linear  residual 
techniques  [7].  Thus  the  ua,  Ug  provide  weights  for  the  data  within  their 
respective  clusters,  the  over  a  and  3  Include  only  those  points  in  the  sum- 


of-squares  which  are  not  too  wild,  and  the  factors  such  as  j ( j  -  1)/A(ua> 
equal  unity  for  small  and  no  wild  points,  reducing  9q2  to  a  typical  variance 
estimate  for  the  two-level,  single  factor  problem  in  analysis  of  variance.  The 
degrees  of  freedom  (d.o.f.)  for  9q2  Is  suggested  [5]  to  be  0.7  times  the  usual 
N  -  3.  The  d.o.f.  for  S2  is  taken  to  be  1,  analogous  to  the  sum-of-squares  of 
two  means  around  the  population  mean.  The  signal  to  noise  ratio  or  variance 
ratio  is  estimated  by 

SNR  -  S2/a  2  (5) 

n 

This  SNR  is  not  the  one  which  we  wish  to  maximize  over  j.  To  optimize  j,  we 
must  allow  greater  variation  within  cluster;  therefore  we  just  set  ua  »  Ug  ■  0 
for  all  a,  $,  in  the  calculation  of  9q2  .  Allowing  for  infinite  SNR  the 
algorithm  must  first  test  for  zero  in  denominator  expressions.  Further  we 
would  not  wish  to  assign  just  one  (or  maybe  even  just  two)  value  to  a  cluster. 
That  constrains  2  <  j  £  N-2. 

In  addition  we  must  test  the  hypothesis  that  S  -  0.  Given  the  above  d.o.f 
we  accept 

H0:  S-0if  F*  <  F1j0.7(11_3)<Y).  (6) 

where  we  suggest 


o 

n 


and  y  is  cumulative  area  to  the  left  of  the  F-density  (with  one  and  0. 7 (N-3) 
d.o.f.)  at  the  abscissa  Fj^  Q  For  example  if  y  -  0.50  and  N  -  11, 

F,  -  ,(0.50)  ~  0.520.  Thus  1  -  y  ■  P_  -  probability  of  false  alarm  if  the 

X  f  j  •  o  * 

data  were  purely  gausslan. 


Robust  cluster  algorithm  number  2 

This  algorithm  uses  the  conventional  ANOVA  method,  but  the  data  is  pre- 


processed  with  a  three  or  five-point  median  filter  (a  length  small  with  respect 
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to  the  window).  Typical  results  show  that  both  S  and  Qqz  have  been  reduced 

because  of  the  rounding  effect  of  the  median  at  the  step  location.  Justusson 

(in  [8])  gives  the  variance  for  such  a  filtered  normal  white  sequency  (M  *  3) 

to  be  0.4487o2  [8].  He  also  shows  the  rounding  effect  at  the  step  location. 

The  d.o.f.  measure  is  often  taken  to  be  2TB  where  2B  is  the  two-sided 

equivalent  signal  bandwidth  and  X  is  the  data  record  length  [9,10].  We  can 

determine  2B  from  a  known  discrete  auto-correlation  function  R~(t)  of  the 

x 

I 

median-filtered  sequence  x^,  assuming  A  seconds  between  samples.  The  equiva¬ 
lent  band-width  B  is 
00 

2B  -  /  S~(f)df/S~(0) 

—00 

CO 

-  R~ (0)//  R~ (x)  (8) 

x  -00  X 

Discretizing  these  measures  the  d.o.f.  in  measuring  d^2  with  N  samples  is 

M-l 

v  -  N  R~ (0)/  l  R~(k)  (9) 

k— M+l 

From  Justusson  [8]  we  have  for  M  ■  3,  Rx(0)  "  0.4487,  Rx(l)  *  0.2480,  and 
R^(2)  -  0.1177.  Because  we  remove  means  from  the  and  squares,  the 
overall  d.o.f.  for  N  points  over  the  two  clusters  is  vg  -  2.  This  gives  the 
clusters  respectively 


vn  -  j  R-(0)/I  R~ (k) 


(10a) 


vx  -  (N  -  j)Rj(0)/I  R~(k) 


(10b) 


d.o.f.,  and  the  variate  estimate  in  the  general  case  is 


V~T2  I(v0  ‘  l)  50Z  +  (V1  '  1}  V1 


where  o^2  are  variances  for  Cq  and  calculated  with  -  1  degrees  of  freedom. 


He  suggest  that  using  standard  ANOVA  applied  to  pre-median  filtered  data 

with  3  2  calculated  as  in  (10)  and  (11)  is  appropriate  for  detecting  and  esti- 
n 

mating  a  step.  However,  because  of  the  effect  that  pre-median  filtering  has 
on  the  data  near  the  step  location,  the  signal  estimate  is  smaller  than  with 
other  methods.  This  may  lead  to  erroneous  acceptance  of  H^:  S  *  0  for  small 
step-to-noise  ratios,  where  other  methods  would  not  err.  In  addition  the  d.o.f. 
estimate  per  Equations  (9-11)  is  only  useful  to  the  extent  that  R~  is  known. 

He  will  see  in  our  results  that  these  formulas  gave  P^,  lower  (and  thus  ■ 
probability  of  miss  higher)  than  designed. 

III.  Simulations  for  F-test  Error  Probabilities  for  Steps 
in  Gaussian  and  Non-Gaussian  Noise 

Data  has  been  produced  to  allow  evaluation  of  both  robust  clustering 
algorithms.  The  programs  yield  maximum  SNR  clustering  within  the  running 
data  window.  However,  for  determining  merit  factor  (defined  later)  and  proba¬ 
bilities  of  miss  and  false  alarm  the  window  samples  are  independent.  The 
number  of  window- length  data  sets  is  375.  Hhen  steps  are  present  they  occur 
3(N-l)/2  +  1  points  apart. 

As  discussed  following  Equation  5,  to  optimize  the  clustering  SNR  a  more 
conventional  SNR  is  used;  this  is  implemented  by  setting  the  data  weights  equal 
to  unity.  He  still  have  a  robust  SNR  measure  in  that  cluster  medians  rather 
than  means  are  used  in  estimating  the  steps.  The  weights  are  retained 
(Equations  3-4)  for  SNR  which  is  used  in  the  F-statistic,  Equation  (7),  which 
is  to.be  compared  to  F^  Q  7(N_3)CY)*  F*  F*  then  no  steP  is  declared; 
otherwise  a  step  is  declared.  The  decision  creates  possibilities  for  both 
miss  and  false  alarm,  which  probabilities  are  estimated  by  the  simulations. 

The  type  of  contaminated  Gaussian  noise  considered  is  such  that  the  density 
of  x(k)  (with  no  signal  step)  is 


Thus  Che  contamination  is  also  Gaussian,  but  x  has  heavier  tails  when  0^  >  1* 

In  addition  a  merit  factor  calculation  is  evaluated.  This  figure  of 
merit  is  discussed  by  Pratt  [3]  with  regard  to  detection  of  edge  pixels  in 
images.  The  formula  is 

i  Ia  i 

f°»-  j;  •  <13> 

where  1^  ■  max(I^,  1^)  and  1^  and  1^  are  respectively  true  number  and  actual 

detected  number  of  steps  in  the  test  data,  a  is  a  scaling  constant,  and  d  is 

the  location  error  distance  count  between  a  true  step  and  one  detected.  We 

have  used  a  *  1/9;  for  this  value  fom  becomes  0.9  if  for  example  all  steps  are 

detected  but  mislocated  by  one  point.  If  on  the  other  hand  there  were  no  false 

alarms,  and  all  detections  were  accurately  located,  but  20Z  were  missed,  then 

fom  ■  0.8.  At  the  other  extreme,  if  fraction  p  of  the  detections  are  false 

alarms  (I„  ■  I_),  and  those  p  are  all  d  counts  away  from  a  true  step,  then 

N  1— p  I 


1-tod  (1-p) 

fon-  ~iW 


(this  only  applies  when  there  are  true  steps  (SNR  >  0)  in  the 


data).  It  can  be  seen  that 


(1  +  ad2)"1,  p  -  1 


1  -  P 


,  d  -*■  00 


Note  that  in  our  figuring,  a  step  detected  with  d  >  2  is  considered  both 
a  false  alarm  and  a  miss,  but  that  d  £  2  constitutes  a  valid  detection.  Now 
consider  the  results  of  Figures  1-3  wherein  e  ■  0  (uncontaminated  Gaussian). 

Figure  1  shows  the  probability  of  miss  vs.  SNR  for  N  ■  15.  The  curves 
within  the  Figure  are  for  y  (the  variable  in  F(y))  equal  to  0.3,  0.9,  and  0.99. 
These  curves  show  that  at  SNR  0,  Pr(miss)  -  y,  which  is  to  be  expected  be¬ 
cause  as  S  -*•  0  with  fixed  noise  the  density  of  the  data  is  essentially  that  of 
the  noise.  At  higher  SNR's  Pr(miss)  decreases  for  fixed  y  and  increasing  N. 


Figure  2  tends  to  verify  the  use  of  the  d.o.f.  prescribed  to  Oqz  in  the 
F-value  [5],  i.e.,  0. 7 (N-3) •  However,  some  inaccuracy  is  noted.  The  curves 
should  be  level  at  the  value  1  -  y;  that  is,  Pr (false)  -  1  -  y.  The  curves 
are  generally  too  low,  even  though  they  are  somewhat  better  for  larger  N. 

For  small  N,  such  as  N  -  5,  undoubtedly  the  variance  estimates  are  pessimis¬ 
tically  too  large,  reducing  the  Pr (false).  But  for  N  larger  (9,  15))  and  y 
large  (.9,  .99),  the  d.o.f.  approximation  seems  to  be  fair. 

Figure  3  is  the  figure-of-merit  curve  for  N  *  15.  We  found  that  the 
trend  is  toward  higher  merit  for  N  larger  (as  reasonable) .  Also  as  N  increases 
there  is  a  lower  SNR  at  which  point  higher  y  are  more  meritorious  than  lower  y. 
We  realize  that  as  SNR  -+  0,  y  »  0.5  will  maximize  fom  and  minimize  probability 
of  either  error  type,  but  at  some  higher  SNR  maximation  is  via  larger  y.  Sim¬ 
ilar  figures  may  be  found  in  [11]  for  N  *  5  and  9. 

We  now  discuss  and  compare  results  when  noise  is  contaminated  Gaussian. 

Specifically  the  noise  is  e  *  0.05  contaminated  at  scale  7.14  2  *  51.0  ) 

such  that  a  2  ■  3.5. 
x 

Figure  4  is  comparable  to  Figure  1.  With  contamination  the  probability 
of  miss  at  any  SNR  is  greater.  The  contamination  undoubtedly  raises  the  noise 
variance  estimates,  lowering  F*  values  below  the  decision  threshold,  and  causes 
the  miss.  It  has  been  shown  in  [11]  that  conventional  noise  variance  estima¬ 
tors  will  be  much  larger  than  the  robust  estimators,  decreasing  F*  even  more 
and  increasing  the  probability  of  miss. 

Curves  for  false  alarm  and  fom  have  also  been  run  with  noise  contamina¬ 
tion,  Very  little  difference  is  noted.  Again  this  is  likely  due  to  the  fact 
that  the  robust  estimators  make  it  difficult  to  let  wild  points  alter  the  sig¬ 
nal  estimate  while  at  the  same  time  increasing  somewhat  the  noise  estimate. 

Thus  the  trend  is  to  slightly  reduce,  if  anything,  the  false  alarm  rate  as 
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Figure  3 


Figure  of  Merit  vs.  SNR,  N  ■  15,  no  contamination,  algorithm 
number  1. 
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contamination  Increases.  This  Is  perhaps  a  beneficial  result  of  using  the 
robust  estimators  of  cluster  centers  and  noise  variances.  Figure  of  merit 
does  slightly  decrease  at  larger  SNR,  due  to  increased  miss  rate. 

We  next  tested  robust  algorithm  number  2  with  a  pre-median  filter  length 
M  ■  3.  Results  are  shown  in  Figures  5-7  for  no  contamination  and  in  Figure  8 
for  the  same  contamination  as  earlier. 

Comparing  Figures  1  and  5,  probability  of  miss  with  e  «  0  are  higher  for 
algorithm  number  2  than  those  of  algorithm  number  1.  An  explanation  is  that 
pre-median  filtering  considerably  smoothes  to  the  middle  of  the  step  edge,  re¬ 
ducing  the  measured  spread  between  clusters. 

Figure  6  for  probability  of  false  alarm,  e  ■  0,  N  -  15,  algorithm  number 
2,  should  be  compared  to  Figure  2  of  algorithm  number  1.  It  may  be  seen  that 
false  alarm  rate  has  dropped  considerably  due  to  the  pruning  of  wild  points 
by  the  pre-median  filtering.  The  cost  however  was  the  increased  probability 
of  miss,  just  discussed.  An  inappropriate  estimate  of  d.o.f.  is  another  ex¬ 
planation,  causing  the  threshold  to  be  too  high. 

IV.  Simulations  of  Signal  Estimation 
In  order  to  better  visualize  the  results  of  using  the  running  window 
clustering  algorithm,  example  runs  with  different  SNR  and  F-threshold  were 
simulated.  The  noise  process  in  each  of  these  runs  is  the  same.  Its  density 


pn(n)  -  0.95  N(0,1)  +  0.05  N(0,51) 

and  a  2  ■  3.5.  Signal  steps  occur  at  sample  numbers  22,  44,  88,  ...  .  Window 
n 

size  is  held  fixed  at  N  «  15,  and  SNR  ■  S2/aQ2,  where  S  *  true  step  size  between 
clusters. 


S-'.V*  *4  *  j 


\  **  v', 


Figure  5 


PR(MISS)  N=15.GflMMR  PflRfiM. 


.  Probability  of  miss,  no  contamination,  algorithm  number  2, 
N  -  15. 
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Two  step-size  estimators  are  used.  Method  1  is  as  we  have  described 
before;  the  new  level  assigned  to  the  output  is  the  median  of  that  cluster. 

Cl,  which  contains  the  newest  data.  Method  2  does  not  assign  an  output  level 
until  the  next  step  is  detected;  then  it  assigns  the  median  of  all  samples 
between  step  locations.  Nevertheless,  step  detection  with  method  2  uses  only 
data  within  the  window.  This  is  necessary  if  the  F-value  is  to  remain  fixed. 
Method  1  is  necessary  for  real  time,  finite-delay  decisions  and  signal  level 
estimation  as  we  find  in  Kalman  filtering. 

Figures  9-11  are  for  method  1  with  SNR  ■  4,  2,  and  1  respectively,  and 
Figures  21-23  similarly  show  results  of  method  2.  Method  2  should  be  better 
due  to  the  larger  number  of  samples  possible  between  step  detections  than 
that  in  cluster  (0  within  the  window).  Figures  9-11 (a)  show  the  result  for 
y  -  0.5  and  Figures  9-11 (b)  show  the  result  for  Y  ■  0.95,  or,  respectively, 
false  alarm  thresholds  0.5  and  0.05.  On  the  left  is  the  input  data  and  on 
the  right  are  true  and  estimated  signal  plots.  Figures  12-14  are  for  Y  *  0.95, 
Method  2. 

These  figures  verify  expectations.  Fewer  steps  are  allowed  with  Y  ■  0.95, 
the  step  locator  is  surprizingly  accurate  even  in  considerable  noise,  and 
estimated  step  value  is  reasonably  good  and  immune  to  wild  points. 

Comparing  Figures  9-11 (a)  with  Figures  12-14,  it  is  not  clear  that  method 
2  is  advantageous.  More  analytical  work  or  simulation  needs  to  be  done. 

V.  Conclusions 

This  work  has  introduced  robust  statistical  methods  as  a  solution  to 
the  problem  of  detecting  stepped  signals  in  non-Gausslan  noise.  The  method 
uses  F-statistics  and  standard  analysis-of-variance  techniques  for  detecting 
steps,  but  the  cluster  means  are  medians  and  the  noise  variance  estimates  use 
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MIST  SIGNAL  fUPRBCC.  .B=TRUE 


Figure  9-b.  Method  1,  true  (B)  and  estimated  (A)  signal  in  noise  from  data 
(left).  SNR  -  4,  Pp  -  0.05,  N  -  15. 
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Figure  11-b.  Method  1,  true  (B)  and  estimated  (A)  signal  in  noise  from  data 
(left).  SNR  -  1,  P  -  0.05,  N  -  15. 


a  biweight  method  for  robustness.  It  is  shown  that  probability  of  false 
alarm  can  be  closely  predicted  and  that  step  locations  and  values  are  closely 
estimated.  The  algorithm  in  one  of  its  variations  should  find  uses  in  any 
field  where  constant  valued  data  is  to  be  located  in  heavy-tailed  noise. 
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CHAPTER  1» 


ROBUST  ADAPTIVE  KALMAN  FILTERING  FOR 
SYSTEMS  WITH  UNKNOWN  STEP  INPUTS  AND 
NON-GAUSSIAN  MEASUREMENT  ERRORS 

Abstract 

Target  tracking  with  Kalman  filters  is  hampered  by  target  maneuvering  and 
unknown  process  and  measurement  noises.  We  show  that  moving  data  windows  may 
be  used  to  analyze  state  and  measurement  error  sequences,  determining  robust 
estimates  of  bias  and  covariance.  For  steps  in  the  system  forcing  functions 
and  non-Gaussian  measurement  errors  the  robust  estimators  yield  improvements 
over  linear  bias  and  covariance  estimators.  Extensive  simulations  compare 
conventional,  linear  adaptive  and  robust  adaptive  average  step  responses  of  a 
first-order  system  filter.  Quantities  examined  are  state  estimate,  state 
error,  process  and  measurement  covariance  estimates,  Kalman  gain  and  input 


step  estimate. 


I.  INTRODUCTION 


The  tracking  of  targets  Is  a  difficult  problem  in  practical  situations 
Measurements  of  target  location  are  usually  derived  from  time  delay  esti¬ 
mates,  which  is  a  subject  in  itself,  and  a  large  body  of  literature  deal¬ 
ing  with  that  alone  has  been  compiled  in  recent  years  and  is  typified  in  a 
recent  special  issue  [1].  The  translation  of  time  delays  to  target  loca¬ 
tion  and  motion  parameters  is  via  nonlinear  functions,  and  error  variances 
on  the  delays  result  in  confidence  Intervals  on  the  location  parameters 
[2,  3].  Hassab  and  Boucher' have  shown  experimentally,  however,  that  delay 
estimates  are  not  Gaussian  [4].  In  many  other  filtering  problems  the 
measurements  are  also  non-Gauss ian.  Several  Kalman  tracking  schemes  are 
examined  by  Hassab  et  al.  [S],  wherein  the  plant  model  is  linear  in  the 
states  while  the  measurement  relationship  is  nonlinear  in  the  states. 

Previous  efforts  of  applying  Kalman  filter  methodology  to  tracking 
have  been  very  useful,  yet  the  practical  problems  of  non-Gaussian  noise 
and  maneuvering  targets  have  either  been  ignored  or  only  lightly  touched 
upon  with  the  linearized  approaches  until  recently.  It  may  be  reasonable 
to  model  target  system  process  noise  as  Gaussian,  but  it  is  probably  not 
reasonable  to  model  either  the  deterministic  (but  unknown)  system  forcing 
function  or  the  time-delay  measurement  noise  as  Gaussian.  Recent  papers 
by  Bradley  and  Kirlin  [6]  and  Weiss  and  Weinstein  [7],  as  well  as  other 
previous  papers  giving  simulations  of  time  delay  estimation  algorithms, 
have  pointed  out  the  need  for  editing  the  delay  measurements,  particularly 
in  low  SNR  or  narrow  band  signal  situations. 

Forcing  functions  for  the  target  motion  system  may  be  taken  into 

account  by  the  1 «■*«  tracker  by  increasing  the  variance  of  the  process 

noise,  preferably  adaptively.  An  example  of  a  similar  scheme  is  given  by 


Godiwala  Id],  Derain  a  bank  o£  Kalman  filters,  each  assuming  a  different 
mean  process  input,  are  adaptively  weighted.  A  number  of  target  maneuvers 
are  simulated  to  test  the  algorithm.  Results  are  good,  although  some  limi¬ 
tations  are  discussed,  and  not  too  drastic  velocity  changes,  termed  semi- 
Markov  in  the  simulation,  have  a  0.01  probability  of  occurring. 

A  more  difficult  and  typical  tracking  problem  has  been  addressed  by 
Groutage  [  9,  10].  Based  on  work  by  Myers  and  Tapley  [11],  Groutage  uses 
maan  and  covariance  estimators  from  moving  windows  on  the  state  and  measure¬ 
ment  error  sequences  to  adapt  to  non-zero,  deterministic  acceleration  in¬ 
puts.  He  also  employs  a  non-op timal  Kalman  gain  modification  and  robust 
post-smoothing  of  the  state  estimates.  These  methods  yield  improved  results 
over  the  linear  adaptive  methods  in  [11]. 

The  non-Gauasian  noise  problem  is  also  addressed  by  Boncelet  and 
Dickinson  [12],  who  suggest  formulating  the  Kalman  filter  as  a  regression 
problem  and  performing  robust  regression.  They  show  no  results,  however. 
Another  approach  without  results  is  discussed  by  Agee  and  Turner  (in  [13]). 
Masreliez  and  Martin  [14]  develop  an  optimal  Kalman  gain  to  edit  wild 
measurements  with  a  specific  non-Gaussian  noise,  but  they  do  not  consider 
step  inputs.  Meyr  and  Spies  [IS]  use  probability  of  outliers  to  edit  wild 
points  in  the  Kalman  filter,  but  again  do  not  adaptively  deal  with  step  in¬ 
puts. 

We  propose  tracking  methods  similar  in  kind  to  those  of  Groutage  and 
Myers  and  Tapley,  but  adaptive  robust  estimators  of  location  and  spread  are 
used  in  place  of  their  linear  estimators  of  mean  and  variance.  The  location 
estimator  chosen  is  the  median,  which  is  known  to  precisely  track  noise-free 
step  functions  without  the  smearing  effect  inherent  with  running  sample 
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means  [16-17].  The  robust  covariance  measure  uses  the  biweight  proposed 
by  Tukey  [18]  and  has  good  promise  for  dealing  with  the  non-Gaussian  delay 
estimates. 


II.  CONVENTIONAL  AND  ADAPTIVE  KALMAN  FILTERS 
The  system  state  equation  is 

x(k  +  1)  -  Ax(k)  +  u(1c)  +  w(k),  (1) 

where  x  is  the  state  vector,  A  is  a  matrix,  u(k)  is  an  acceleration  vec¬ 
tor,  and  w(k)  is  a  vector  of  white  noises  which  account  for  either  or  both 
source  and  sensor  array  random  motions. 

The  measurement  vector  is 

z(k)  -  Hx(k)  +  v(k),  (2) 

where  v  is  an  additive  error  vector,  and  H  is  a  matrix. 

The  conventional  estimator  and  error  covariance  update  equations  are 


[19] 

x(k)  -  Ax(k-l)  +  K(k) (z(k)  -  HAx(k-l))  (3) 

P(k)  -  (I  -  K(k)H) (AP(k-l)AT  +  Q(k-l))  (4) 

K(k)  -  (AP(k-l)AT  +  Q(k-1))HT 

[H(AP(k-l)AT  +  Q(k-1))HT  +  R(k)]"1  (5) 

wherein 

P(k)  -  cov(x(k))  (6) 

Q(k)  -  cov(w(k) )  (7) 

R(k)  -  cov(v(k) )  (8) 


Initial  conditions  on  x,  P,  Q,  and  R  must  of  course  be  specified. 
Quite  often  P(0)  is  set  to  Q(l),  Q(l)  and  R(l)  are  known,  and  x(0)  may 
be  set  to  (0). 
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In  the  adaptive  systems  similar  to  that  of  Groutage  [10],  N  most 

s 

recent  samples  of  the  state  error  sequence 

f(k)  -  x(k)  -  Ax(k-l)  (9) 

are  used  to  estimate  an  unknown  constant  forcing  function  or  bias.  That  is, 
let  u(k-l)  be  an  adaptive  state  bias  estimator. 


u(k-l)  -  —  Z  f(k-j), 
Ng  j-1 


This  bias  is  then  included  in  the  state  estimator: 

x(k)  -  Ax(k-l)  +  u(k-l)  +  K(k) (z(k)  -  H(Ax(k-l)  +  u(k-l)) 

-r  (k)  )  .  (11) 

Similarly  in  (11),  r(k)  is  an  adaptive  measurement  bias  estimator  based  on 
the  most  recent  errors  in  the  predicted  measurements: 

*00  -  l  [z(k-j+l)  -  H(Ax(k-j)  +  u(k-j))]  (12) 

Z  j-1 

Myers  and  Tapley  [11]  show  that  for  H  linear  the  bias  measures  (10)  and  (12) 
are  unbiased.  Groutage  [10]  also  shows  that  for  linear  H  that  the  covariance 


estimator 


where 


QOO  -  JTT  l  [f'(k-j+l)  -  a'(k)’)(f'(k-j+l)  -  u'(k))1] 

3  Ns 

-  l  (AP (k-j )AT  -  P(k-j+l)) 

s  j-1 


f'(k)  -  x (k)  -  Ax(k-l)  -  u(k-l) 


u"(k)  -  l  r(k-j+l) 

s  j-1 


r  't  : ,  f  .  •> Vv  *  . 


i  ’  "/V  v X 


is  unbiased.  Similarly  the  measurement  covariance  estimate, 

NZ 

R(k)  -  Z  {(y(lc-j+l)  -  r(k))(y(k-j+l)  -  r(k))T 

Z  j-1 

N  -1 

-  HP(k-j+l)HT>,  (16) 

NZ 

where 

P(k)  -  AP(k-l)AT  +  Q(k-l)  ,  (17) 

P  is  an  estimate  of  P  using  Q  in  (4),  and  y(k)  »  z(k)  -  H(Ax(k-l)  +  u(k-l)) 
is  shown  to  be  unbiased.  Also  in  [11]  are  recursive  formulations  for  u, 

A  A  A 

Q,  r,  and  R. 

Although  r  may  be  estimated  using  (12) ,  and  it  is  even  necessary  in 
estimating  R,  we  show  in  Appendix  A  that  both  u  and  r  cannot  be  simul¬ 
taneously  estimated  as  indicated. 

Groutage  has  proposed  modifying  the  gain  K(k)  in  (5)  in  such  a  way 

A  /V 

that  when  R  -*■  0,  the  modified  gain  is  the  same  as  (5),  but  as  R  “,  the 
modified  gain  changes  exponentially  rather  than  inversely  toward  zero. 

His  example,  using  noise  plus  a  square  pulse  of  acceleration  driving  a 
double  integrator  and  knowing  only  measurements  of  position,  shows  good 
improvement  over  both  the  conventional  and  previous  adaptive  techniques. 
Another  Groutage  innovation  uses  the  biweight  [18]  to  smooth  resulting 
estimates  of  states  and  acceleration  input  as  computed  from  the  derivative 
of  the  velocity  estimate. 

The  foregoing  ideas  have  encouraged  further  nonlinear  methods  for 
dealing  with  unknown  deterministic  inputs  and  non-Gausaian  noise.  We 
describe  these  in  the  following  section. 


III.  ROBUST  METHODS  FOR  NON-GAUSSIAN  NOISES  AND  FORCING  FUNCTIONS 
The  processings  discussed  In  section  II  are  the  basis  for  many  track¬ 
ing  schemes.  The  theoretical  developments  have  evolved  from  assumptions 
of  linear  or  linearized  state  and  measurement  equations  with  known  deter¬ 
ministic  plus  known  Gaussian  inputs  to  considerations  of  and  allowance  for 
nonlinear  equations,  unknown  non-Gaussian  noises  and  unknown  inputs. 

Exactly  at  which  steps  or  to  which  data  the  robust,  adaptive  techniques 
should  be  applied  has  not  been  totally  answered,  and  indeed  maybe  no  one 
process  will  be  found  that  answers  a'x  problems.  Yet  it  is  clear  that  in 
the  face  of  all  the  unknowns,  adaptive  methods  and  robust  data  handlers 
offer  improvements. 

The  particular  problem  addressed  here  is  the  state  system  of  Equation 
1  with  unknown  step  input  u(k)  and  measurements  as  in  Equation  2.  The 
state  estimator  and  its  covariance  matrix  are  as  in  Equations  11  and  4 
respectively,  but  estimates  of  fi,  §,  and  R  and  to  be  obtained  as  follows. 
Thus  we  have  the  estimate: 

x(k)  *  Ax(k-l)  +  u(k-l)  +  K(k)c(k) [z(k)  -  H(Ax(k-l)  +  u(k-l) ) 


-  r(k)]  (18) 

the  measurement: 

z(k)  »  H(x(k) )  +  r (k)  +  v(k)  (19) 

the  error  covariance  estimate: 

P(k)  -  (I-K(k)H)(AP  (k-l)AT  +  Q(k-l))  (20) 

the  robust  process  covariance  estimate: 

QOO  -  (qi;j)  (21) 

the  robust  measurement  error  covariance  estimate: 


the  robust  forcing  function  estimate  (med  ”  median)  based  on  the  N  most 
recent  estimation  errors: 

u(k)  *  med  (x(k)  -  Ax(k-l) (23) 

s 

the  robust  measurement-bias  estimate  based  on  the  most  recent  measurement- 
prediction  errors: 

r(k)  »  med  (z(k)  -H(Ax(k-l)  +  u(k-l))}„  (24) 

Z 

and  non-op timal  Kalman  gain: 

K(k)  -  (AP (k-l)AT  +  Q(k-l))HT(k)[H(AP(k-l)AT  +  Q(k-1))HT 

+  R(k)]~l  (25) 

and  gain  modifier  c(k)  to  be  formulated  in  the  following. 

The  elements  qim  are  found  by  one  of  the  procedures  suggested  in 
Mosteller  and  Tukey  [16].  For  i  -  m. 


Ng  £'(f'(k-j+l)  -  «ed{f>(J)})*(l-Mlj2)4 
(1"Mij)(l-5Mij)H'l  + 


(26) 


where  (1-M2  )2  are  the  biweights  for  the  j—  sample  of  the  i—  state 


variable,  and 


M 


ij 


f"(k-j+l)  -  med(f"(J)} 


6  med{ j  f^(m)  -  med{f "(m) } |  j } 


k-N  +1  <  m  <  k 
s 


(27) 


and  Z'  Indicates  summation  only  over  those  j  for  which  M2^  <  1.  Thus  the 
biweight  weights  less  those  variations  or  the  state  estimate  which  are  more 
distant  from  the  median  and  gives  weight  zero  those  variations  which  are 
greater  than  6  times  the  median  variation.  It  can  be  seen  that  for  small 


Gaussianly  distributed  variations  around  the  true  median,  reduces  to 


an  unbiased  variance  estimator. 


The  covariance  estimators  a.  may  be  found  using  a  number  of  robust 

lm 

methods  -  Spearings  rank  correlation  has  been  shown  to  be  effective,  giving 


si»  -  • 


(28) 


where  p,  is  the  correlation  estimate, 
im 

The  r<m  are  found  in  the  same  manner  as  the  For  i  *  m. 


im 

2s<* 


Nz  r(yi(k-j+l)  -  r^k^Cl-n^2) 

11  "  I V  (i-nij2)(i-5n112)]t-i+r(i-ni12)(i-5n112)] 


'ij 


ij 


'ij 


(29) 


where 


and 


yi(k-j+l)  -  '?i(k> 


‘ij 


6  med{  ly^m)  -  r^k)  I  ' 


k-N  +1  <  m  <  k 
o 


(30) 


y±(k)  -  zi(k)  -  H(Ax(k-l)  +  u(k-l)) 


(31) 


With  the  we  propose  a  biweight  gain  modifier,  c(k),  a  diagonal  matrix 


with  diagonal  elements 


ci±(k) 


fa- 


n2  )2 

il' 


nil*  1 
nil  >  1 


(32) 


Thus  for  nominal  estimated  measurement  errors  y^(k)  with  respect  to  the 
running  error  median  r^k),  the  weight  (1-n^)2  is  approximately  unity 

A 

and  cii(k)  has  no  effect  on  the  gain  K(k).  A  wild-point  measurement  z^, 
however,  causes  to  be  large  and  c^Ck)  -  0,  resulting  in  the  x^k) 
becoming  Ax(k-l)  +  u(k-l),  simply  the  prediction. 

The  formulas  (18;-C32)  constitute  a  robust  and  resistant  Kalman  filter 
particularly  suited  for  systems  having  heavy-tailed  and  biased  non-Gaussian 
measurement  noise  and  non-deterministic  step  forcing  functions.  The  esti- 

A  A 

mators  Q  and  R  and  gain  modifier  c(k)  will  deal  with  wild  points  and  heavy¬ 
tailed  or  mixed-density  noise.  The  estimate  u  will  track  steps  in  unknown 


forcing  functions;  preserving  the  sharpness  of  the  step  yet  excluding 
(Ng-l)/2  or  fewer  consecutive,  clustered  wild  points  from  affecting  the 
estimation.  Example  theoretical  effects  comparing  linear  and  robust  estima¬ 
tors  near  steps  are  given  in  Appendix  B. 


IV.  SIMULATION  RESULTS  OF  FIRST-ORDER  SYSTEM  KALMAN  FILTERING 
The  findings  of  previous  chapters  have  been  integrated  into  a  Kalman 
filtering  problem  with  a  single  state.  The  state  and  measurement  equations 
are 

•w  ’  **k  +  \  +  “k. 

\  m\  +  \ 

For  all  of  the  simulations  we  have  used  (where  N(u»a2)  — >  Gaussian) 
wfc  ~  N(0,0.05) 

vfc  ~  0.95N(0,1)  +  0.05N(0,51) 

0  ,  k  S  199 

*  10.0,  k  2  200 

N  -  window  length  -  15,  N  ■  N  ■  N  . 

s  z 

Initial  conditions  for  the  three  filters  tested  (conventional,  linear  adap¬ 
tive,  and  robust  adaptive)  are  the  same: 

A  _ 

xQ  -  0 

u0-° 

Q  -  0.05 
R  -  1.00 

Two  values  of  A  are  used,  0.1  and  0.9.  A  *  0.1  gives  a  wider  bandwidth 
system,  following  the  input  shape  more  closely.  The  noise-free  system  out¬ 
put  with  a  step  Au  input  at  k  ■  0  is 
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Au  ,,  -k£nAx 

\  “  ~ T  (1  "  e  > 

For  small  process  and  measurement  noises  compared  to  these  values,  errors 
In  estimates  of  x  and  predictions  of  z  after  the  step  will  be  initially  essen¬ 
tially  equal  to  the  values  of  x  until  the  estimate  u(k)  causes  the  adaptive 

filter  to  close  the  gap.  The  delay  with  the  robust  method  is  greater,  but  once 
the  switch  is  made  it  may  catch  up  faster  than  the  linear  adaptive  method  (see 

again  Figure  B2a,b,d).  When  the  x(k)  are  more  nearly  a  step  (this  is  true  when 
A  is  smaller),  the  advantage  of  robust  as  just  explained  matches  the  theory  and 
Figure  2  better.  For  any  A,  however,  the  robust  method  deals  with  the  non- 
Gaussian  noise  better. 

Another  aspect  considered  in  the  tests  is  the  use  of  the  factor 
c^  *  (1  -  ni;L2)2  per  Equations  (.32)  and  (30-31)  for  modifying  the  Kalman  gain 
when  a  new  measurement  is  deemed  "wild."  Another  delay  is  introduced  by  using 
c11,  but, except  near  the  step,  results  improve. 

One  measure  of  improvement  is  given  by  the  merit  formula 

M  »  a  2/(a~2  +  a  2)  (33) 

V  X  V 

so  that  as  the  error  variance  a~2  approaches  zero,M  approaches  unity.  This 
measure  allows  comparison  of  the  different  process  methods. 

We  rely  on  the  median  to  estimate  the  input  u^  per  Equation  (23) , 

\  ■  “dt%  -  • 

Observe  that  the;  residual  z^  -  Ax^_^  -  u^.^  will  be  nearly  u.  for  the  first 
(N  -  l)/2  samples  following  a  step.  This  gives  a  half -window  delay  (see 
Figure  B2-d)  in  trying  to  estimate  x^  and  trying  to  estimate 

eoo  -  MdUk  -  -  ak_x)  . 

(Although  we  do  not  find  r(k)  to  use  as  a  bias  on  z  (we  cannot  -  see  Appendix  A), 

A 

it  is  helpful  in  estimating  R  and  giving  a  feasible  K  near  the  step.) 


(35) 


[I'd  -  n  2) (l  -  5n  2)] [-1  +  l' (l  -  n  2)(i  -  5n.2)] 
j  3  3  J  3  3 


Because  of  the  primed- sums ,  only  those  terms  which  are  within  the  range  with 
(1  -  n^2)  >  0  are  used.  This  leads  to  "switching"  effects  in  Q  and  R.  The 

first  switch  occurs  when  the  large  residuals  become  the  majority;  another 
switch  occurs  when  the  state  estimates  catch  up  again  to  the  steady-state 
state  and  the  small  residuals  are  again  the  majority.  Some  of  these  effects 
can  be  minimized  by  refiltering  the  past  (N-l)/2  samples  immediately  upon  step 
detection,  but  that  feature  has  not  yet  been  implemented  in  our  work. 

Similar  effects  occur  in  the  conventional  linear  adaptive  formulas  for 

^  A  A,  /V 

r,  u,  Q  and  R,  but  because  of  the  linearity  more  gradual  errors  and  also  cor¬ 
rections  in  these  estimates  occur  and  peak  at  different  times.  Overall  system 
instabilities  might  be  analyzed  through  the  use  of  the  flow-graph  in  Figure  Al. 

Although  r^,  the  "unknown"  bias  on  z,  is  known  here  to  be  zero  and  is  not 
used  in  calculating  x^,  r^  is  used  in  calculating  R,  This  prohibits  R  from  get¬ 
ting  so  large  that  the  Kalman  filter  relies  almost  entirely  on  prediction.  Again, 
as  the  estimate  catches  up  to  the  true  state  after  the  step,  z^  and  Ax^_^  +  u^_^ 
are  on  the  same  order,  2^  is  nearly  zero  again  and  R  begins  to  approach  its  true 
value . 

Figure  1  shows  the  average  process  outputs;  curves  A  ■  measurement,  B  ■  state, 
C  ■  conventional  filter,  D  ■  linear  adaptive  filter  and  E  ■  robust  adaptive  filter. 
In  all  cases  except  for  A  -  .9,  c^  ■  (1  -  h2)2,  the  robust  estimate  recovers 
more  quickly  in  response  to  the  estimate.  In  Figure  Id,  the  combination  of  the 
output  state’s  slow  rise-time  (A  ■  .9)  and  delay  due  to  c^  *  (1  -  n2)2  causes 
the  robust  filter  to  have  considerable  delay;  however,  upon  recovery  its  merit 
is  still  superior  to  the  conventional  adaptive  system  because  of  its  ability  to 
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c)  A  -  0.9,  cu  -  1 
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d)  A  -  0.9,  e  -  (1  -  n!)2 


Figure  1.  Average  measurement  (A) ,  true  state  (B) ,  and  filter  estimates  (C) , 
linear  adaptive  (D) ,  and  robust  adaptive  (E) . 
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handle  Che  non-Gauss lan  measurement  noise.  Figure  of  meric  curves  for  Che  four 
cases  of  Figure  1  are  shown  in  Figure  2. 

In  Figure  2  (and  all  subsequent  Figures),  curves  A,  B,  C  are  respectively 
labels  of  conventional,  linear  adaptive,  and  robust  adaptive.  In  all  cases  the 
robust  method  returns  more  quickly  to  a  high  meric  racing.  For  both  the  fast 
and  slow  systems  (A  *  0.1  and  A  »  0.9)  the  gain  biweight  factor  c^  gives  an 
improved  transient  figure  of  merit  compared  to  the  unmodified  Kalman  gain  K. 
However  some  delay  is  the  cost  of  this  improvement.  That  is,  for  example  in 
the  A  ■  0.1  system  with  *  1,  recovery  occurs  at  about  k  ■  230  and  robust 
meric  rises  to  about  0.8  at  400  seconds.  But  with  *  (1  -  n2)2,  recovery 
occurs  approximately  at  k  ■  270,  yet  merit  quickly  rises  to  about  0.97. 

In  none  of  the  cases  does  the  average  linear  adaptive  merit  figure  rise 
to  more  than  about  0.25  by  k  ■  400. 

Figures  3a  and  3b  are  single-run  results  corresponding  to  Figures  lb 
and  d  respectively  (A  -  0.1,  c^  *  (1  -  C2)2;  and  A  -  0.9,  c^  ■  (1  -  n2)2), 
except  that  the  error  rather  than  actual  response  is  shown.  Note  how  in  the 
fast  system  (Figure  3a  )  the  robust  method  recovers  most  quickly,  as  expected 
from  the  ideal-step  theory;  but  in  the  slow  system  (Figure  3b)  the  compounded 
delays  of  the  robust  method  are  detrimental.  Averages  of  these  response  types 
are  given  in  Figures  3c  and  d.  These  figures  show  that  the  robust  average 
error  value  is  closer  to  zero  from  about  k  -  240  or  k  -  280,  accounting  for  its 
higher  merit.  The  non-Gaussian  errors  evident  in.  Curve  B  for  the  single-run  in 
Figure  3a  explain  this  improvement. 

Average  estimates  of  Q  and  R  are  shown  in  Figures  4  and  5  respectively. 
Considerable  variation  is  noted  as  parameters  vary.  However,  when  the  state 
step  is  large  (A  •  0.9),  it  is  clear  that  Q  follows  true  to  theory  in  that  dur- 

A 

ing  the  transition  robust  Q  (curve  C)  deviates  less  from  the  uncontaminated 
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Figure  3.  Error  plots:  (A)  conventional,  (B)  linear  adaptive,  (C)  robust. 
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Figure  4.  Average  Q:  (A)  conventional,  (B)  linear  adaptive,  (C)  robust. 
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Figure  5.  Average  R 


V 


process  variance  chan  does  Che  linear  Q  (curve  B).  Curve  A,  Che  convencional 
non-adapcive  Q,  of  course  is  always  correcC.  We  have  noc  yet  cested  for  Che 
situacion  where  true  variance  switches  values. 

The  essentially  unpredictable  effects  on  average  Kalman  gains,  excluding 
the  c^  factor  are  given  in  Figure  6.  The  correct  gain  for  true  Gaussian 
measurement  noise  and  no  step  is  in  curves  A,  and  these  are  constant.  Curves 
B  and  C  should  maintain  this  level  before  the  step  and  should  eventually  return 
to  this  level. 

In  Figure  7  are  the  estimates  of  the  forcing  function  for  the  four  para¬ 
meter  sets.  Again,  because  the  state  estimation  errors  in  the  fast  system 
(Figures  7a-b)  are  more  step-like,  the  robust  method  gives  a  good  improvement. 
Only  in  Figure  7d  is  the  delay  excessive,  and  even  then  curve  C  is  better 
after  recovery. 

V.  CONCLUSIONS 

Features  of  separate  previous  works  have  been  implemented  into  one  algo¬ 
rithm  to  estimate  states  in  systems  having  unknown  step  inputs  which  are  large 
with  respect  to  the  process  noise,  and  to  handle  non-Gaussian  measurement 
errors.  The  running  window  to  ol serve  signal  steps  in  non-Gaussian  noise  has 
been  robustified  by  the  use  of  medians  and  biweights.  These  statistics  have 
proven  useful  in  making  more  accurate  and  responsive  the  otherwise  conventional 
Kalman  filter.  A  figure  of  merit  which  compares  state  estimation  error  to 
measurement  error  has  been  introduced.  It  shows  quite  well  the  advantage  of 
the  robust  scheme  over  earlier  linear  adaptive  and  conventional  filterings. 

From  these  results  we  are  encouraged  to  seek  more  theoretically  optimal  robust 
techniques  and  to  experiment  with  higher  order  systems  and  other-than  step 
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Figure  6.  Kalman  gains,  excluding  :  (A)  convencional,  B)  linear  adaptive 

(C)  robust. 
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APPENDIX  A 


COMMENTS  ON  THE  ADAPTIVE  ALGORITHMS 


Myers  and  Tapley  [I1J  proposed  a  method  of  estimating  an  unknown  forcing 
constant  u(k)  *  u  and  any  bias  r  in  the  measurement  noise.  These  estimates 


are  (similar  to  Equations  10,  12-16) 

1  - 

u(k)  »  —  l  f(j) 

*  s  jsk-N  +1 


and 


-n  1  - 

r(k)  -  f-  l  y(j) 

2  j-k-N  -t-1 
J  2 


A-l 


A- 2 


where 


and 


f(k)  -  x(k)  -  Ax(k-l) 


A- 3 


y(k)  -  2(k)  -  H(Ax(k-l)  -  G(k-l)) 
The  covariance  estimators  were  given  to  be 


A- 4 


q  .  V 

x  v  -1  - 

‘ s  j-k-N  +1 
J  s 


(f(J)  -  u(k))(f(j)  -  u(k) ) 


1  k 

-  —  l  (AP(j-l)Ar  -  P(j))  A- 5 


s  j-k-N  +1 
J  s 


and 


R  -  T~T  l  (7(j)  -  r(k))(y(j)  -  r(k)T 

2  j-k-N  +1 
z 


,  k 

-  J-  l  HXAPCj-DA1  +  Q(j-1))HT  A-6 


2  j-k-N  +1 
z 


Groutage  [lo]  rederived  the  Q  and  R  estimators  but  used  f"(k)  - 
£(k)  -  Ax(k-l)  -  u(k-l)  -  x(k)  -  x(k)  in  Q.  The  analysis  in  [10]  leads  to 
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01 


.  K  _ 

5  "  FTT  I  C*'CJ>  ‘  S'Oc))(f'(j)  -  u"(k) )T 

V*  ■  1 


s  *  j-k-N  +1 

9 


-  IT  I  (AP(j-l)AT  -  P(J)>  , 


s  j-k-N  +1 
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where  u '(k)  is  Che  average  of  the  f '(j).  Averaging  100  simulations  has  shown 
Chat  this  estimator  leads  to  accurate  results,  Q  going  negative  only  rarely. 
(Myers  and  Tapley  used  | Q f  to  overcome  negative  estimates.  This  study  chooses 
instead  to  use  the  most  recent  positive  estimate) .  Note  that  A-l  must  still 
be  used  to  estimate  u,  the  step  input. 

The  biases  of  the  two  sample  covariance  estimators  are  the  same.  Also 
the  difference  in  terms. 


-  iT(k))  -  <f(j)  +  u(k)) 


1  * 

-u(j-l)  +  •=-  l  u(m-l)  , 

_ _ 1-  M  l  1 


s  m-k-N  +1 
s 

has  expected  value  equal  to  zero.  Thus  we  expect  that  the  difference  in 
variances  of  Q  by  A-5  and  by  A- 7  is  also  small  or  zero. 

Next  the  instability  of  the  variance  estimators  demonstrated  in  [11]  and 
remarked  on  in  ocher  literature  is  considered  here.  A  flowgraph  showing 
system  states,  the  Kalman  filter  and  estimators  for  r  and  u  is  given  in  Figure 
Al.  It  is  clear  that  the  constants  appearing  as  inputs  at  the  summing  junc¬ 
tion  whose  output  is  z(k)  cannot  be  distinguished  after  their  addition. 
Analysis  of  the  iterations  in  the  scalar  case  verifies  the  proportioning  of 


V  ^  V\  V 


gure  Al.  Myers  and  Tapley  Algorithm  for  estimation  of  input 
bias  u  and  measurement  bias  r. 


any  one  of  Che  biases  into  the  cvo  estimators.  This  erroneous  division  of  the 

A  A 

biases  leads  to  erroneous  Q  and  R  which  in  turn  contaminate  the  Kalman  gain  and 
P  matrices.  A  solution  to  this  difficulty  has  not  yet  been  proposed  other  than 
to  know  a  priori  one  of  the  biases  -  particularly  r,  as  we  are  particularly 
interested  in  unknown  step  inputs. 


APPENDIX  B 


ERROR  ANALYSIS  NEAR  STEPS 


The  adaptive  estimators  for  u,  r,  Q  and  R  formulated  in  Section  III  give 
rise  to  certain  errors.  Those  of  greatest  concern  here  are  the  transient 
errors  in  tracking  a  step  input  and  the  mean  square  errors  in  estimating  pro¬ 
cess  and  measurement  noise  variances.  Because  both  formulas  and  numerical 
data  are  available  in  the  literature  for  Gaussian  noise,  our  examples  will  be 
taken  therefrom.  Results  will  be  indicative  of  those  for  the  non-Gaussian 
case,  but  will  also  clarify  the  non-optimality  of  the  robust  procedures  when 
errors  are  indeed  Gaussian. 

First  consider  a  running  N-polnt  sample  median  w  on  data  w^  (Here  sequence 
is  denoted  by  subscripting).  Justusson  states  (in  [16])  that  the  approximate 
variance  if  the  w^  are  independent  and  identically  distributed  (i  i  d)  normal 
N(m,  0)  then 


r  * 
var[  w 


IT 


N  +  ir/ 2-1 


/2 


8-1 


which  is  57%  larger  than  that  of  the  sample  mean,  w. 
is  distributed  with  density  (Laplacian) 


f  (w) 


/T  |  w-m  |  / o 


However  if  the  noise 


B— 2 


then 


var[  w  ] 


oz/2  , 
N-l/2 


B-3 


which  is  SOX  smaller  than  the  variance  o2/N  of  the  mean.  Also  for  this 
density,  the  median  is  that  value  a  which  minimizes 


N 

min  Z  |w.-a|.  B-4 

a  i-1  1 

This  leads  to  believing  that  medians  are  useful  for  smoothing  noise  with 
heavy-tailed  densities.  A  testament  to  this  belief  is  found  in  [20], 
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wherein  ic  is  shown  that  as  noises  become  more  implusive  than  the  Laplacian, 
the  optimal  order-statistic  filter  quickly  approaches  the  median.  To 
begin  the  error  analysis,  let  a  window  of  length  N  straddle  a  step  in  the 
data  such  that  the  variables  w^,  Wj,  . have  density  N(0,a)  and  the 
variables  *  •••»  have  density  N(S,C),  and  assume  k  <  OH-D/2). 

For  small  S,  E{w}=E{3}  ■  Sk/n;  but  for  larger  S, 


w  -  w 


[  (N+l)/2,  N-k]. 


where  w[(N+]_)/2  N-k]  is  the  order  statistic  of  w^,  w^,  ...  w^_k 


Cui  s  wi+i)- 

The  mean 


"  C«9)  ■ 


of  w  is  then  bounded  by  [19]  OF  1  ^  ^N-ki-l^ ^  -  E(w 
where  F  is  the  cumulative  distribution  of  w. 


[  (N+l)/2,  N-k]‘ 


The  analytical  expression  for  the  variance  of  w  is  available  in  the 


literature  [19]  only  in  the  form 


a*  -  (5  -  e{w}) 2 


;  N-k 


(C)« 


where 


fN+I.N  k(w)  -  F*"l(ir)[l-F<w>  ]““*"■£  (w>  B-6 

m-  (N+D/2. 

We  may  thus  plot  E(w)/S  and  0^  vs  i,  for  various  S,  o  and  N,  where 

w 


(n(0,  a)  ,  i <  0 
'i  ^TlUS,  a)  ,  i>  1  . 


An  example  E(w}  and  w  vs  i  for  o*l,  S-5,  N-3  is  given  by  Justusson  (in 
[16]).  As  the  expected  error  in  tracking  the  step  with  a  running  median 
is  symmetrical  about  i  «  N/2  (considering  N  odd,  N/2  is  between  i  * 
and  — sp) ,  we  need  only  plot  the  expected  error,  E(w(j)}  where  1  £  j  £  . 
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This  Is  shown  in  FigursBl  for  several  values  of  N,  assuming  S  t  5a,  using 
data  from  Teichroev  [22].  Shown  in  Figure  B2  are  plots  of  w  and  3  for  a 
no-noise  situation,  and  E{q)  and  E{d2;  for  S»5,  0*1,  N»5.  The  results 
show  that  the  nedian  does  a  very  good  job  of  tracking  the  step  In 
comparison  to  the  mean.  This  feature  offers  great  advantage  over  conven¬ 
tional  adaptive  and  non-adaptive  methods  in  Kalman  filtering. 

In  Figure  32  two  advantages  of  the  robust  technique  are  evident.  First, 

the  step  of  mediw^}  is  a  delayed  but  otherwise  exact  replica  of  the  signal; 
the  overall  sun  squared  error  is  greater  by  a  factor  of  3N/  (2N-1) ,  as  will 
be  shown.  Secondly  the  running  linear  standard  deviation  measure  has 
an  approximate  peak  value  S//2 ,  whereas  the  peak  robust  Q^2  has  approximate 
value  2o  (for  '>5  and  S  >  So).  These  values  are  found  as  follows. 

The  sum-squared  error  of  the  running  sample  average  in  the  noise  free 
case  is 

N-l 

SS  -  Z  (S(i)-S)2 
4  i-1 

-  S2 (N-l) (2N-1)/ (6N)  b-8 

The  sum-squared  error  of  the  running  sample  median  in  the  noise-free 

case  is 

N-l 

SS  *  £  (w(i)-S)2 

a  i-1 

-  (N-l)S2/2  B-9 

The  degradation  factor  is  given  by 

SS  /SS  -  3N/(2N-1),  B-10 

a  a 

approximately  a  factor  of  3/2. 

The  mean  of  the  running  linear  variance  estimator,  d2,  in  the  transl- 

C* 

tion  region  1  <  i  S  N-l  for  the  low  noise  case  is  essentially  the  mean 
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Figure  B2.  Sequence  waveforms  of  running  location  and  variances  asti- 
mators;  (a)  noise-free  seep  in  data,  (b)  5-point  running  noise-free  aver¬ 
age,  (c)  5-point  running  nearly  noise-free  (S>5a)  mean  variance  estimate, 
(d)  5-point  running  noise-free  median,  (e)  5-point  running  robust  Q,  Si5a 


t 


square  of  the  error  S  -  w(j)  over  i-lH-1  <  j  £  i.  Considering  no  error  for 
j  i  0  or  j  >  N, 

E{3*  (i)}  *  TJ ~  S  (S  -  jS/N)2,  1  <  i  S  N-l 

j-1 

-  (S2/N2)i(N2-N(i+l)  +  (i+1)  (2i+l)/6)  B-ll 

The  peak  value  in  B-ll  occurs  at  i  ■  (N  ±  l)/2  and  it  is  S2/2.  Thus  the 
peak  standard  deviation  estimate  is  S/^2. 

A 

The  mean  of  the  robust  measure  of  variance,  Q,  is  plotted  in  Figure  B3, 
using  data  from  Teichroew  [20],  under  the  assumption  that  S/o  is  large.  This 
approximation  is  derived  as 'follows.  In  this  scalar  case  Q  is  of 

f\,  * 

Equation  26,  with  w(i)  representing  the  u^Ck),  k*i,  and  w(j)- representing 
x^k-j+l)  -  Ax^Ck-j ) .  Thus  at  time  i,  and  assuming  b^  negligible, 

'N  V  <v(j)  -  w(i))2(l-u;)" 

ElQ(i)}  -  E  — i - - -  )  b-12 

[V  (l-u2)(l-5u2)][-l  +  r  (1-u2)(1-5u2)] 
l  j  4  J  j  J  J  J 


where  Z'  only  includes  the  N  most  recent  data  for  which  ji2  £  1  and 

J  j 

y  -  "il)  -..yiU - - - 

1  6  “4(>w  -  “WUIj-im.  <  „  <  i  b-13 

In  the  noise  free  case,  aed{ [w(m)-w(i) j }  is  exactly  equal  to  zero  because 
the  majority  of  the  w(m)  in  the  window  are  exactly  equal  to  w(i).  Now  there 


are  i  values  nearly  equal  to  S  and  N-i  values  nearly  equal  to  0  within  the 
window,  and  let  i  £  (N-l)/2.  Then  med{ j w(m>-w(i) j }  is  very  small  w.*.  to 

<\j 

S  and  w(j)  -  w(i)  in  the  numerator  of  B-13  will  have  magnitudes  smaller 
than  6  med{  [w(m)-w(i)| }  only  for  N-i  of  the  w(j)  samples.  Thus  E{Q(i); 
reduces  for  large  S/a 
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Figure  B3.  Approximate  normalised  robust  variance  estimate  for  various 
window  sixes.  S/a  large,  using  data  from  [20] .  The  estimate  is  symmetric 


w*sij  A- -V  Jc  JuVAJt.Aj 


J>.  r»  ^U  .1*  Sa- Ji‘J*  It  m*v  .  < 


^(i»  -  froVar  f (wa>  •  "<1))2 

where  there  are  N-i  terms  in  the  sum.  Results  in  [20]  can  be  used  to  show 
that  this  can  be  written 

*««»  ■  Tt&m [1+ 

where  E(w2>  -  a2  +  E2{w>,  and  wp>  N_i  is  the  -  order  element  of  the  N-i 
w 

near-zero  valued  elements,  and  p*(N+l)/2  -  i,  1  £  i  S  (N-l)/2.  That  is,  p 
is  the  median  order  of  the  N  samples,  but  the  median  is  among  the  N-i 
samples  and  has  lesser  order  among  them. 

The  data  in  Figure  B3  is  a  plot  of  values  of  the  formula  B-15  for  various 
window  sizes  N.  The  plot  for  N*5  is  square-rooted  and  shown  in  Figure  2(e) . 
The  larger  the  window,  the  greater  the  ratio  S/a  required  for  accuracy  or 

r\j  Nil 

relevancy  of  these  plots.  The  peak  moment  E{v2  }  occurs  when  i  *  — ^ — * 

p.a-i  t 

as  it  is  the  2nd  moment  of  the  first  order  statistic  in  the  N-i  set.  For 
N-21  this  moment  is  7.7 a2.  As  the  corresponding  median,  shown  in  Figure  Bl, 
is  about  1.6a,  the  variance  of  the  N-i  samples  around  their  median  is  evi¬ 
dently  about  7.7-1.62,  or  a  standard  deviation  of  2.27 a.  If  we  require  5 
deviations  around  1.6a  to  be  less  than  S/2,  then  (1.6  +  5(2.27))a  <  S/2,  or 
S/a  >  26.  Such  a  requirement  is  of  course  only  necessary  for  the  approxi¬ 
mation  of  Equation  B-15  to  hold. 

We  note  that  any  such  estimate  using  Equation  ( 26)  with  pulse-like 
errors  as  in  Figure  B3  could  easily  be  smoothed  with  a  following  median 
filter.  It  is  well  known  that  median  filters  of  length  Nq  will  remove 
pulses  of  duration  (Nq-1)/2  or  less.  Of  course,  any  true  step  in  noise 
variance  would  be  detected,  but  at  the  cost  of  a  time  delay  equal  to  half 
the  length  of  the  following  filter. 
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